
Use of Natural Language Processing to identify Obsessive Compulsive Symptoms in patients with schizophrenia, schizoaffective disorder or bipolar disorder

Authors:

Abstract

Obsessive and Compulsive Symptoms (OCS) or Obsessive Compulsive Disorder (OCD) in the context of schizophrenia or related disorders are of clinical importance as these are associated with a range of adverse outcomes. Natural Language Processing (NLP) applied to Electronic Health Records (EHRs) presents an opportunity to create large datasets to facilitate research in this area. This is a challenging endeavour, however, because of the wide range of ways in which these symptoms are recorded, and the overlap of terms used to describe OCS with those used to describe other conditions. We developed an NLP algorithm to extract OCS information from a large mental healthcare EHR data resource at the South London and Maudsley NHS Foundation Trust using its Clinical Record Interactive Search (CRIS) facility. We extracted documents from individuals who had received a diagnosis of schizophrenia, schizoaffective disorder, or bipolar disorder. These text documents, annotated by human coders, were used for developing and refining the NLP algorithm (600 documents) with an additional set reserved for final validation (300 documents). The developed NLP algorithm utilized a rules-based approach to identify each of the symptoms associated with OCS, and then combined them to determine the overall number of instances of OCS. After its implementation, the algorithm was shown to identify OCS with a precision and recall (with 95% confidence intervals) of 0.77 (0.65–0.86) and 0.67 (0.55–0.77) respectively. The development of this application demonstrated the potential to extract complex symptomatic data from mental healthcare EHRs using NLP to facilitate further analyses of these clinical symptoms and their relevance for prognosis and intervention response.
SCIENTIFIC REPORTS | (2019) 9:14146 | https://doi.org/10.1038/s41598-019-49165-2
www.nature.com/scientificreports
David Chandran1, Deborah Ahn Robbins1, Chin-Kuo Chang2, Hitesh Shetty3, Jyoti Sanyal3, Johnny Downs1,3, Marcella Fok1, Michael Ball1, Richard Jackson1, Robert Stewart1,3, Hannah Cohen1, Jentien M. Vermeulen4, Frederike Schirmbeck4,5, Lieuwe de Haan4,5 & Richard Hayes1
The increasing use of electronic health records (EHRs) across health services provides opportunities for research using real world data1. However, there are challenges to realizing the potential of EHRs in research. For example, whilst some information is recorded in structured fields, often useful contextual information, such as description of symptoms, is embedded in free text2. Information from free text fields can be extracted and coded manually to generate datasets which can be analyzed for research, but this is not feasible on a large scale. Consequently, the advantages of conducting studies on a larger scale are lost: for example, the statistical power to look at rare exposures or outcomes, and the ability to control for multiple potential confounders. An alternative to manually coding free text fields is to develop automated approaches through the application of Natural Language Processing (NLP).
1King's College London, Institute of Psychiatry, Psychology, and Neuroscience, London, United Kingdom. 2University
of Taipei, Department of Health and Welfare, Taipei City, Taiwan. 3South London and Maudsley NHS Foundation
Trust, London, United Kingdom. 4University of Amsterdam, Department of Psychiatry, Amsterdam, The Netherlands.
5Arkin Institute for Mental Health, Amsterdam, The Netherlands. Lieuwe de Haan and Richard Hayes contributed
equally. Correspondence and requests for materials should be addressed to D.C. (email: david.chandran@kcl.ac.uk)
Received: 5 September 2018
Accepted: 15 August 2019
Published: xx xx xxxx
Content courtesy of Springer Nature, terms of use apply. Rights reserved
NLP approaches have been used previously to facilitate a number of studies using EHRs3, although the technique is still at a relatively early stage of application. NLP algorithms take into account the linguistic context around words and phrases of interest and go beyond a simple keyword search of the text: for example, distinguishing between instances where a patient is described as experiencing a particular symptom from instances where the text states that the patient is not experiencing that symptom, or where it is someone else (e.g. a friend or relative) who is experiencing that specific symptom. A keyword count would not be capable of making such distinctions. NLP applications can be developed using machine learning and rules-based approaches, each with its own advantages and drawbacks in relation to specific problems.
Machine learning in the context of NLP refers to an automated method of creating an NLP application4. It
involves an annotated set of training data being utilized by various algorithms to create models to classify future
documents5. e time taken for the algorithm to create this model and to apply the model to future issues varies
based on a wide range of factors; however, a machine learning approach takes substantially less time to develop
than a rules-based approach.
In contrast, rules-based approaches involve a human coder manually analyzing training data and creating rules based on their observations of the data6; these rules are then implemented programmatically. This can be a challenging task, as the rules created need to be broad enough to ensure that they are applied to all the required instances, but not so broad that they are applied incorrectly. An advantage of the rules-based approach is that it arguably allows rules to be created for substantially more complex problems than most machine learning algorithms would be able to solve (particularly with similarly sized datasets)7.
NLP applications can operate with a high degree of precision (positive predictive value) and recall (sensitivity)
although this can vary considerably between applications8. Variation in performance of NLP applications may
relate in part to the complexity of the task being undertaken9. As EHRs continue to be exploited for research, NLP
is being applied to increasingly subtle and complex tasks10.
In this paper, we describe the development of an NLP application for extracting data on obsessive compulsive symptoms (OCS) and obsessive compulsive disorder (OCD) from free text in EHRs in patients with schizophrenia, schizoaffective or bipolar disorders, using a rules-based approach. In the context of schizophrenia, OCS can be defined as persistent, repetitive, intrusive, and distressful thoughts (obsessions) not related to the patient’s delusions or repetitive, goal-directed rituals (compulsions) clinically distinguishable from schizophrenic mannerisms or posturing11. For the purposes of this investigation we define OCD patients as a subset of OCS patients where a clinician has recognised that these symptoms are of sufficient severity or duration, or cause sufficient functional impairment, that a clinical diagnosis of a disorder is given. OCS are relatively common in schizophrenia or related disorders and are associated with depressive symptoms and poorer functioning12. There are a number of challenges inherent in identifying co-morbid OCS from clinical records for this patient group. For example, the terms used to describe OCS (such as obsession and compulsion) are not specific to these symptoms, and these symptoms need to be distinguished from other aspects of these disorders. This study provides an example of the development of an NLP algorithm for a relatively complex task using a large psychiatric EHR database. We describe obstacles and solutions.
Methods
Setting. e data used to develop the NLP algorithm for extracting OCS were obtained from the South
London and Maudsley NHS Foundation Trust (SLaM) which is a near-monopoly secondary mental healthcare
service provider to 1.36 million residents in four boroughs of south London (Croydon, Lambeth, Lewisham
and Southwark), as well as providing some national tertiary mental healthcare services. e SLaM Biomedical
Research Centre (BRC), supported by National Institute for Health Research funding, provides anonymised elec-
tronic clinical records from the SLaM Case Register for research purposes through the BRC Clinical Record
Interactive Search (CRIS) system. e CRIS system was developed in 2008 and accesses full EHRs across all SLaM
services since 2007, including both structured and open-text elds, currently on more than 300,000 service users.
A detailed description of CRIS and its development is described elsewhere7.
Ethics statement. e CRIS data resource received appropriate research ethics approval as a de-identied
database for secondary analyses from Oxford Research Ethics Committee C (reference 08/H0606/71 + 5) and the
authors can conrm that the study presented here was performed in accordance with the guidelines and regula-
tions set out in this approval.
Inclusion criteria. To develop and test the OCS NLP algorithm, data extracts were obtained for individuals who were aged 15 years or older at the time of their first severe mental illness (SMI) diagnosis date within the observation period (from 1 January 2007 to 31 December 2015) and who had received a diagnosis (ICD-10 code) of schizophrenia (F20), schizoaffective disorder (F25), or bipolar disorder (F31) during the observation period. Diagnoses were obtained from structured fields and also from unstructured free text using a previously validated NLP algorithm, described elsewhere13.
Denition of OCS. As mentioned above, in this study OCS were dened according to the Structured Clinical
Interview for DSM Disorders–Patient (SCID-P)14 as ‘persistent, repetitive, intrusive, and distressful thoughts
(obsessions) not related to the patient’s delusions, or repetitive, goal-directed rituals (compulsions) clinically distin-
guishable from schizophrenic mannerisms or posturing”. As such, individuals whose obsessional thoughts or com-
pulsions were related to psychotic content of thoughts or delusions were not considered to have comorbid OCS9.
Extracting data for training and validation of the algorithm. Data were extracted from EHRs for training and development of the algorithm from those individuals who met the inclusion criteria. To avoid reading and coding a substantial volume of unrelated documents we applied a filter such that we only extracted
documents containing specic key terms. Although, once developed, an NLP algorithm can be substantially more
sophisticated than a key word search, the development process may include keyword searchers. In this instance
a set of key terms were selected which were potentially broad enough to cover all the records that mentioned
OCS. e Yale Brown Obsessive Compulsive Scale (Y-BOCS)15 was used as a guide to select these key words. e
following key words terms (as shown in table1) were used to lter the EHRs:
• OCD (and variations such as O.C.D)
• Obsess (and variations such as “obsessional” and “obsessive”)
• Compulsive (and variations such as “compulsion”, “compulsiveness” and “compelled”)
• Ritual (and variations such as “ritualistic”)
• Hoard (and variations such as “hoarding” or “hoarded”)
• The presence of any of the YBOCS key terms.
• The presence of any of the Patient Insight key terms.
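A keyword filter of this kind can be approximated with regular expressions. The following is a minimal sketch, not the authors' implementation: the pattern list and the name `contains_key_term` are assumptions made for illustration, and only the five OCS stem families are covered (the YBOCS and Patient Insight terms are omitted). The exclusion of "compulsory" follows Table 1.

```python
import re

# Hypothetical patterns approximating the OCS key terms listed above.
# The actual filter used in the study is not given in the text.
KEY_TERM_PATTERNS = [
    r"\bo\.?c\.?d\b",          # OCD, O.C.D
    r"\bobsess\w*",            # obsessive, obsessional, obsession
    r"\bcompuls(?!ory)\w*",    # compulsive, compulsion; not "compulsory"
    r"\bcompelled\b",          # listed as a variation of "compulsive"
    r"\britual\w*",            # ritual, ritualistic
    r"\bhoard\w*",             # hoarding, hoarded
]
KEY_TERM_RE = re.compile("|".join(KEY_TERM_PATTERNS), re.IGNORECASE)


def contains_key_term(text: str) -> bool:
    """Return True if the document mentions at least one key term."""
    return KEY_TERM_RE.search(text) is not None
```

A document would pass the filter when `contains_key_term(document_text)` is true; everything else would be dropped before manual coding.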
rough applying the lter, a random sample of 900 documents that contained at least one of the terms shown
in Table1 (including patient notes and correspondence), with one document per unique patient, were randomly
extracted from the anonymised EHRs. is sample was then divided into a training set (600 documents) and a
validation set (300 documents). ese documents contained multiple instances of references to OCS, with each
document containing at least one instance, with no upper limit. Text strings around each key word (described
above) were extracted from these documents. Each text string included the keyword and the sentence which
contained this key word plus two sentences either side of the key word sentence. is was to ensure that any
contextual information contained in the surrounding sentences could be incorporated into the NLP algorithm.
In some instances, the text strings comprised fewer than ve sentences due to there being less than two sentences
before and/or aer the keyword sentence in that particular document.
Developing manual coding rules. The training and validation sets of documents were then manually coded according to a predetermined set of manual coding rules which were developed using the Y-BOCS as a guide. The approach taken when developing the manual coding rules for identifying OCS in text strings is outlined in appendices A and B. For example, if the text mentioned that the patient had both obsessions and compulsions, then the patient was classified as having OCS. However, if the text only mentioned obsessions or compulsions (but not both terms) this was only considered OCS if the text also listed specific examples found in the Y-BOCS, such as checking or cleaning, or described intrusive, ego-dystonic thoughts. There were a number of reasons for this conservative approach: firstly, clinical text may be produced by a range of different health professionals or may describe a patient’s belief about themselves, and secondly, the terms obsession or compulsion are used in a wide range of contexts beyond OCS.
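The example rule in this paragraph can be written as a small decision function. This is only a sketch of that single rule; the full rule set is in the appendices and is not reproduced here, and the function and parameter names are hypothetical.

```python
def code_instance(has_obsessions, has_compulsions,
                  has_ybocs_example, has_intrusive_egodystonic):
    """Apply the example manual coding rule to one text string.

    Both terms present -> OCS. Only one term present -> OCS only if the
    text also gives a specific Y-BOCS example (e.g. checking, cleaning)
    or describes intrusive, ego-dystonic thoughts. Neither -> not OCS.
    """
    if has_obsessions and has_compulsions:
        return True
    if has_obsessions or has_compulsions:
        return has_ybocs_example or has_intrusive_egodystonic
    return False
```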
Annotating training and validation data sets. The training dataset consisted of 600 documents (containing at least one of the key words) which were manually annotated by two annotators (DA, RH), each annotating every record independently. After the annotations were completed, the results were compared, and individual points of disagreement were identified. To resolve these points of disagreement, a discussion occurred between the two annotators, under the supervision of an arbitrator (DC) to ensure that the process did not give one of the annotators an undue level of input. Inter-annotator reliability between the two annotators produced observed agreement of 92.0% (Cohen’s κ of 0.80), indicating good inter-annotator agreement in determining the OCS coding rules.
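For reference, observed agreement and Cohen's κ for two annotators can be computed as below. This is a generic sketch of the standard formula κ = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is the agreement expected by chance; the function name is hypothetical, and the study's own annotation data are not available here.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Proportion of items on which the annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)
```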
Development of an NLP algorithm for extracting OCS. The training data were used to create the classification rules needed to build the algorithm. The algorithm was developed using the General Architecture for Text Engineering (GATE), which includes a suite of tools for the development of NLP rules based on JAPE. JAPE is a Java-based NLP scripting language that is native to GATE. It allows users to generate rules with
| OCS Keywords | YBOCS Keywords | Patient Insight Keywords |
| --- | --- | --- |
| Obses* (includes variations such as ‘obsessive’ and ‘obsessional’) | Clean* (includes variations such as ‘cleaned’ or ‘cleanliness’) | Distres* (includes variations such as distressed or distressing) |
| Compul* (includes variations such as compulsive or compulsively, but specifically excluding “compulsory”) | Wash* (includes variations such as washing or washed) | Unwanted |
| OCD* (includes variations such as OCD and O.C.D) | Check* (includes variations such as checking and checked) | Repugnant |
| Hoard* (includes variations such as hoarding and hoarded) | Repeat* (includes variations such as repeatedly or repetitive) | Repulsive |
| Ritual* (includes variations such as ‘ritualistic’ and ‘ritually’) | Count* (includes variations such as counted or counting) | Egodystonic |
| | Order* (includes variations such as ordered or ordering) | Intrusive |
| | Counting | Intruding |
| | Rearrange* (includes variations such as rearranging or rearranged) | Unable to stop |

Table 1. Key modifier words used in the natural language processing application for OCS.
very high levels of complexity. The manual coding rules which had been applied to annotate the training set were combined with observations of the annotated training set data to create a set of broad rules in JAPE which were then integrated into the application. This involved developing sets of exclusion rules and inclusion rules. Inclusion rules determined the patterns of text required for an instance to be classed as positive, in the absence of exclusion rules. Exclusion rules used sets of exclusion terms, which would lead to an instance being classed as negative (these are terms such as negations, or experiencers that are not the patient themselves). The algorithm involved the following steps (Table 2 contains terms that were used in the exclusion of instances as described in steps 4–8):
1. Split the text at the sentence level.
2. Find the presence of a possible OCS reference (in the context of the particular app) within the text.
3. Check for a combination of terms that would indicate an instance of OCS, in the context of the particular app (e.g. the Hoarding app identifies an instance of hoarding as an OCS symptom).
4. Exclude all instances wherein the text was characteristic of prompt questions within a clinical questionnaire. Specifically, the algorithm identified all combinations of words and punctuation that were unique to forms (which were determined through analysis of the training data). If any instances contained any of these combinations, they were excluded. In the context of the extracted data, there were very few cases of these.
5. Exclude any instances wherein the sentence contains any negating terms. Each of the five apps had a specific set of negating terms. Through an examination of the training data, a list of negating words and phrases was determined. Any instance that contained any of these words or phrases was excluded.
6. Exclude any instances wherein terms referring to experiencers who were not the subject (such as terms referring to family members or friends) appeared in the sentence. This was done by determining a list of terms that could refer to an individual other than the patient (including terms for family members, friends or romantic partners). Any instance that contained one of these terms was excluded.
7. Exclude any instances where there were references to uncertainty about the diagnosis (as the aim of the application was to identify definite instances). This was done by creating a list of hedge words and excluding any instance that contained any of the hedge words.
8. Exclude instances of self-diagnosis (where the text indicates the patient diagnosed themselves with OCD). This was done by examining the training data, finding terms that were used in cases where self-diagnosis occurred and excluding those instances.
We included lexical variants in the extraction rules [i.e. acronyms (e.g. OCD) and misspellings (e.g. matching obses* instead of obsessive)]. We took into account semantic variants of the terms obsessive and compulsive in the extraction because these may have alternative meanings beyond their definitions in the context of OCD/OCS. This was done by distinguishing between the different examples of the obsessions and compulsions provided in the text. Through application of these rules, records were classified.
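The inclusion and exclusion logic in the steps above can be sketched in Python. This is not the JAPE implementation used in the study: the term lists are small illustrative subsets loosely based on Table 2, the names are hypothetical, and step 3 (term combinations) and step 4 (questionnaire prompts) are omitted for brevity.

```python
import re

# Illustrative subsets only; the study used app-specific lists derived
# from the training data, which are not fully reproduced here.
INCLUSION = re.compile(
    r"\bo\.?c\.?d\b|\bobsess\w*|\bcompuls(?!ory)\w*|\bhoard\w*|\britual\w*",
    re.IGNORECASE,
)
NEGATION = re.compile(
    r"\b(?:none|nil|den(?:y|ies|ied|ying)|not? obses\w*|no history|no evidence)\b",
    re.IGNORECASE,
)
OTHER_EXPERIENCER = re.compile(
    r"\b(?:mother|father|sister|brother|parent|son|daughter|sibling|family"
    r"|partner|husband|wife|boyfriend|girlfriend)\b",
    re.IGNORECASE,
)
HEDGE = re.compile(
    r"\b(?:seems?|possibl\w*|apparent(?:ly)?|sounds? like)\b", re.IGNORECASE
)
SELF_DESCRIPTION = re.compile(
    r"\b(?:self-described|described (?:him|her)self|says? that|told me)\b",
    re.IGNORECASE,
)
EXCLUSIONS = [NEGATION, OTHER_EXPERIENCER, HEDGE, SELF_DESCRIPTION]


def classify_sentence(sentence):
    """Positive only if a key term is present (step 2) and no
    exclusion term appears in the same sentence (steps 5-8)."""
    if not INCLUSION.search(sentence):
        return False
    return not any(pattern.search(sentence) for pattern in EXCLUSIONS)


def positive_instances(text):
    """Split text into sentences (step 1) and keep the positive ones."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if classify_sentence(s)]
```

Because any single exclusion term vetoes an instance, a scheme like this trades recall for precision, which matches the stated aim of minimizing false positives.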
Validation of an NLP App for extracting OCS. The validation dataset was used as a final test of how well the algorithm performed compared to manually coded data, providing an indication of how well this algorithm would perform across the remaining 300,000 plus patient records on the CRIS system. To ensure that there was no information bias in the development of the application, the validation data remained unseen by the NLP developer (DC) throughout the App development process, until it was utilized to test the final version of the OCS algorithm. The accuracy of the OCS algorithm was evaluated using measurements of precision (i.e. positive predictive value) and recall (i.e. sensitivity) at the instance level. Precision was measured as the proportion of positive OCS instances identified by the NLP application tool that were correct according to the manual annotations of these same documents; recall was measured as the proportion of OCS instances in the documents (based on manual annotations) that were correctly identified by the NLP application tool. The development of
| Form | Negation | Other Experiencer | Self-Description | Hedge |
| --- | --- | --- | --- | --- |
| c - obsessive compulsive | None | Mother/Father | Self-Described | Seem(s) |
| hoarded materials blocking passages | deny* (includes variations such as denied and denying) | Sister/Brother | He/She describe(s/d) | Possible* (including variations such as possibility and possibly) |
| obsessions and compulsions. none | Nil | Parent | Described Him/Herself | Apparent(ly) |
| Obsessive Compulsive Index (including variations such as o.c.i, oci) | no(t) obses* (includes variations such as obsessed, obsessions and obsessional) | Son/Daughter | Say(s) that | Sound(s) like |
| | than (an) obses* (includes variations such as obsessed, obsessions and obsessional) | Sibling | told me | |
| | No History | Family | | |
| | No Evidence | Boy/Girlfriend | | |
| | | Partner | | |
| | | Husband/Wife | | |
| | | qqqqq (a pseudonym for a family member or carer) | | |

Table 2. List of terms used to identify candidates for exclusion.
the NLP application aimed to maximize precision in order to reduce the likelihood of false positive results. This NLP-based OCS application was then applied across the entire SLaM Case Register. Finally, for research purposes the OCS algorithm was combined with data from a pre-existing diagnosis algorithm and information on diagnosis from structured fields. Overall precision and recall produced by combining these approaches were calculated. The 95% confidence intervals were calculated using the exact binomial method16 in the Stata software package.
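The evaluation metrics can be illustrated with a short sketch. The study used Stata; the code below is a standard-library Python approximation that assumes "exact binomial" refers to the Clopper–Pearson interval, and it solves for the interval bounds by bisection. The function names are hypothetical, and the counts behind the reported figures are not given in the text, so no attempt is made to reproduce them.

```python
from math import comb


def precision_recall(tp, fp, fn):
    """Instance-level precision (PPV) and recall (sensitivity)."""
    return tp / (tp + fp), tp / (tp + fn)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_decreasing(f, target, iters=60):
    """Bisection on [0, 1] for f(p) = target, where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    # Lower bound p_L satisfies P(X >= k | p_L) = alpha / 2.
    lower = 0.0 if k == 0 else _root_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound p_U satisfies P(X <= k | p_U) = alpha / 2.
    upper = 1.0 if k == n else _root_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper
```

For precision, `k` would be the number of true positives and `n` the number of instances flagged by the algorithm; for recall, `n` would be the number of manually annotated positive instances.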
Results
Initially, a machine learning approach was trialed to develop an OCS NLP application because this could be developed and deployed more rapidly. This application was developed using TextHunter, which is a bespoke piece of software developed at the SLaM BRC that allows for the fast creation and deployment of machine learning (ML) applications based on an annotated training set and gold standard. TextHunter utilizes a Support Vector Machine (SVM) approach to building an ML model. SVM development is integrated into GATE, allowing for its rapid implementation. This approach has allowed for the creation of a wide range of successful applications using TextHunter. TextHunter develops models using a wide range of features and parameters, determining which one returns the highest precision, recall and F1 scores. However, the performance of this approach was judged to be insufficient (attaining a precision of 0.74 and a recall of 0.51). It is possible that ML could return better results if a deep learning approach had been taken. However, while this was considered, its operation was determined to be too computationally and time intensive to be practical for the project. It does however remain an option for future exploration of the topic.
Examination of the text strings in the training dataset indicated that there was diversity in the words and phrases surrounding the key words in the training data set. Comparing the annotations generated by the algorithm to manual coding of the same training data indicated that the algorithm was not able to perform at a satisfactory level and that another approach was needed. It was noted that each of the individual keywords could form a potential application of its own. Therefore, we decided to build five separate component algorithms and combine them into a single functional OCS algorithm. This involved running each of the component apps over the training data separately. The broad annotation rules that had been developed earlier were modified to suit each of the five component algorithms individually. A decision was made not to artificially use an equal number of keywords for each component app, to ensure that the sample would be representative of the overall data set with respect to the prevalence of those keywords.
The performance (precision and recall) of individual components of the OCS algorithm in the validation set and the performance overall across all text strings are described in Table 3. Among the component algorithms, precision of over 0.7 was observed for each except Compulsions. Similarly, recall of over 0.8 was observed for each except Ritual and Obsessions. In the specific context of OCD, the OCD component of the algorithm returned a precision of 1 and a recall of 0.85. The component algorithms were used together to detect the presence of any OCS (including OCD), i.e. any text strings which described one or more of the following: obsessions, compulsions, OCD, hoarding, rituals, in accordance with the coding rules outlined in Table 4. The precision and recall (with 95% confidence intervals) for detecting any OCS (including OCD) were 0.77 (0.65–0.86) and 0.67 (0.55–0.77) respectively.
Discussion
These results highlight that it is feasible to develop a tool to address a relatively complex data classification task in electronic health records using NLP. The algorithm we developed was able to identify co-morbid OCS in people diagnosed with schizophrenia, schizoaffective disorder or bipolar disorder with acceptable precision and recall, particularly in terms of the ability to identify OCD, as compared to existing algorithms that have been developed for CRIS (some of which are illustrated in a previous publication7), which showed precisions ranging from 0.93 to 0.97 and recalls ranging from 0.59 to 0.99.
Identifying OCS using clinical records is arguably a comparatively challenging task for an NLP application for a number of reasons. Unlike many other health constructs (e.g. hypertension) OCS are often described using terms which have a wide range of usage in both specialized and lay contexts. For example, ‘obsession’ may be used to describe nothing more than a keen interest, or ‘compulsion’ to describe the desire to engage in risk taking behaviours, such as gambling, neither of which would qualify as OCS. Also, these symptoms can manifest in a variety of ways and in many cases, may not be the primary concern of the clinicians writing clinical notes and correspondence. Moreover, OCS need to be clinically distinguishable from schizophrenic mannerisms or posturing
| Symptom | Precision | Recall |
| --- | --- | --- |
| Obsessions | 0.73 | 0.5 |
| Compulsions | 0.63 | 0.83 |
| OCD | 1 | 0.85 |
| Hoard | 0.73 | 0.81 |
| Ritual | 1 | 0.33 |
| Any OCS (including OCD) | 0.77 | 0.67 |

Table 3. Performance of individual components of the OCS algorithm in the validation set (300 documents) and the performance overall for detecting any OCS (including OCD) across all strings, with precision (positive predictive value) and recall (sensitivity) provided.
or other psychosis-related repetitive thoughts or behaviour10; consequently, the algorithm needed to be able to use information presented in the free text to make these distinctions.
To the best of our knowledge, this is the first time an algorithm has been developed to extract OCS from free text using NLP. However, NLP or other algorithms have been used to extract other types of information from clinical records. Examples of this are applications to find the presence of cognitive behavioural therapy (CBT) delivery17, adverse drug effects (ADR)18 and antipsychotic polypharmacy data13.
A key strength of the OCS algorithm was its ability to successfully identify obsessive and compulsive symptoms in free text with a reasonable level of precision, despite the complexity of these texts and the variety of ways clinicians refer to these symptoms. The performance of this algorithm improved when divided into component algorithms and modified accordingly, performing particularly well at identifying OCD. The key limitation was the lower recall, which created a risk of underestimating cases of OCS. In the development of this algorithm, precision was considered more important than recall, i.e. false positives were considered to be a more important issue to avoid than false negatives. The results presented here are for instances in the text describing OCS (including OCD). In practice this NLP algorithm would be applied to classify patients where there are likely to be multiple instances describing the same symptoms. Consequently, OCS instances that are missed (due to the lower recall) may be less important because there are likely to be other instances for the same patient which the algorithm will detect.
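The argument that missed instances matter less at the patient level can be made concrete with a simple calculation. The sketch below assumes, purely for illustration, that each of a patient's n positive instances is detected independently with probability equal to the instance-level recall r; this independence assumption is not made in the study and is unlikely to hold exactly, since instances for the same patient tend to share wording.

```latex
% Probability that at least one of n instances is detected,
% assuming independent detection with instance-level recall r:
\[
  P(\text{patient detected}) = 1 - (1 - r)^{n}
\]
% Illustration with the reported overall recall r = 0.67 and n = 3:
\[
  1 - (1 - 0.67)^{3} = 1 - 0.33^{3} \approx 0.964
\]
```

Under this assumption patient-level sensitivity rises quickly with the number of documented instances, although correlated misses would make the true gain smaller.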
To develop this algorithm, we adopted a rules-based approach after initially trying a machine learning approach. A rules-based approach has its own limitations. Firstly, it is a substantially more time-consuming approach than a machine learning one19. This is because it takes a coder far longer to identify and code rules than for an automated system to construct a model. Furthermore, a rules-based approach is far more vulnerable to a coder’s subjectivity and personal biases than a machine learning one. This is a particularly key issue given the importance in NLP of understanding the intent of the text that is analyzed. By comparison, the downside to a machine learning approach is that its efficacy decreases in relation to the complexity of the task, which may mean the algorithm is unable to perform well or that the algorithm would require an increasingly large set of training data (increasing the time taken to process that data and create the application and the time required to develop the training data)4.
There are a number of avenues for further development. One approach is to endeavour to improve algorithm
performance by increasing the size of the training dataset, providing further examples of positive and negative
instances which would allow new rules to be developed. A further point of development is identifying the
temporality around positively mentioned OCS, which is not currently determined. This is an important challenge, as
there have not been any previously developed methods for determining (from unstructured text) whether the subject of
a record currently has OCS or had them in the past (and if so, how far in the past). Therefore, future work
will involve adding a temporality component. In addition, the work that was done was limited through using a
single corpus (CRIS); future work will involve using the algorithm over the text data of clinical records from other
trusts, which will give an indication of the algorithm’s generalizability.
Negative for OCS
  Text makes no mention of OCS
  Text states that patient does not have OCS
  Text states that patient has either compulsions or obsessions, not both, and there is no information about any of the following:
    Patient Distress
    Obsessive or Compulsive symptoms described as egodystonic
    Inability to stop Obsessions or Compulsions
    Description of specific compulsions or specific obsessions
    Patient Insight
  Text states that non-clinician observers (patient or family/friends) believe patient has obsessions or compulsions without describing YBOCS symptoms
  Text includes hedge words (i.e., possibly, apparently, seems) that specifically refer to OCS keywords
  Text includes risky, risk-taking, or self-harm behaviours
  Text includes romantic or weight-related (food related) words that modify OCS Keywords
Positive for OCS
  Text states that patient has OCD features/OCD Symptoms
  Text states that patient has OCS
  Text includes hoarding, which was considered part of OCS, regardless of presence or absence of specific examples
  Text states that patient has at least 2 of the OCS keywords
  Text states that patient has either obsessive or compulsive or rituals or YBOCS and one of the following:
    Obsessions or Compulsions are described as egodystonic
    Intrusive, cause patient distress or excessive worrying/anxiety
    Patient feels unable to stop obsessions or compulsions
    Patient recognizes symptoms are irrational or senseless
    Clinician provides specific YBOCS symptoms
  Text reports that patient has been diagnosed with OCD by clinician
Table 4. Manual annotation rules for OCS.
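To illustrate how rules of the kind listed in Table 4 might be expressed programmatically, the following Python sketch encodes a simplified subset of them. It is an illustration under stated assumptions and is not the algorithm developed in this study: the term lists, the regular expressions and the function name annotate are all hypothetical, and negation and hedging are checked across the whole snippet rather than being tied to the OCS keyword as Table 4 requires.

```python
import re

# Hypothetical term lists for illustration only; not the study's lexicon.
KEYWORDS = ("obsess", "compuls", "ritual", "ybocs")  # stems of OCS keywords
QUALIFIERS = ("egodystonic", "intrusive", "distress", "unable to stop",
              "irrational", "senseless")
NEGATION = re.compile(r"\b(no|denies|denied|does not have)\b")
HEDGE = re.compile(r"\b(possibly|apparently|seems)\b")
STRONG = re.compile(r"\b(ocd|hoarding)\b")


def annotate(snippet: str) -> str:
    """Return 'positive' or 'negative' for one text snippet."""
    text = snippet.lower()
    keywords_found = {k for k in KEYWORDS if k in text}
    strong = STRONG.search(text) is not None

    # No mention of OCS at all.
    if not keywords_found and not strong:
        return "negative"
    # Simplification: negation or hedging anywhere in the snippet counts,
    # whereas Table 4 requires the hedge to refer to the OCS keyword itself.
    if NEGATION.search(text) or HEDGE.search(text):
        return "negative"
    # OCD or hoarding mentions are treated as positive without further evidence.
    if strong:
        return "positive"
    # At least two distinct OCS keywords.
    if len(keywords_found) >= 2:
        return "positive"
    # A single keyword needs a supporting qualifier such as distress.
    if any(q in text for q in QUALIFIERS):
        return "positive"
    return "negative"


if __name__ == "__main__":
    for s in ("Patient reports intrusive obsessions about contamination.",
              "Denies any obsessions or compulsions.",
              "Seems to have rituals."):
        print(annotate(s), "|", s)
```

A sketch of this size omits several rules from Table 4 (for example, the exclusions for risk-taking, romantic and food-related contexts), which is part of why hand-built rule sets take substantial coder time to develop.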
References
1. Coorevits, P. et al. Electronic health records: new opportunities for clinical research. J. internal medicine 274, 547–560 (2013).
2. Stewart, R. et al. The south london and maudsley nhs foundation trust biomedical research centre (slam brc) case register: development and descriptive data. BMC psychiatry 9, 51 (2009).
3. Nikiforou, A., Ponirou, P. & Diomidous, M. Medical data analysis and coding using natural language processing techniques in order to derive structured data information. In ICIMTH, 53–55 (2013).
4. Sebastiani, F. Machine learning in automated text categorization. ACM computing surveys (CSUR) 34, 1–47 (2002).
5. Wu, F. & Weld, D. S. Open information extraction using wikipedia. In Proceedings of the 48th annual meeting of the association for computational linguistics, 118–127 (Association for Computational Linguistics, 2010).
6. Winograd, T. Understanding natural language. Cogn. psychology 3, 1–191 (1972).
7. Perera, G. et al. Cohort profile of the south london and maudsley nhs foundation trust biomedical research centre (slam brc) case register: current status and recent enhancement of an electronic mental health record-derived data resource. BMJ open 6, e008721 (2016).
8. Jones, K. S. & Galliers, J. R. Evaluating natural language processing systems: An analysis and review, vol. 1083 (Springer Science & Business Media, 1995).
9. Hripcsak, G. et al. Unlocking clinical data from narrative reports: a study of natural language processing. Annals internal medicine 122, 681–688 (1995).
10. Meystre, S. M., Savova, G. K., Kipper-Schuler, K. C. & Hurdle, J. F. Extracting information from textual documents in the electronic health record: a review of recent research. Yearb. medical informatics 17, 128–144 (2008).
11. de Haan, L., Hoogenboom, B., Beuk, N., van Amelsvoort, T. & Linszen, D. Obsessive-compulsive symptoms and positive, negative, and depressive symptoms in patients with recent-onset schizophrenic disorders. The Can. J. Psychiatry 50, 519–524 (2005).
12. de Haan, L., Sterk, B., Wouters, L. & Linszen, D. H. The 5-year course of obsessive-compulsive symptoms and obsessive-compulsive disorder in first-episode schizophrenia and related disorders. Schizophr. bulletin 39, 151–160 (2011).
13. Kadra, G. et al. Extracting antipsychotic polypharmacy data from electronic health records: developing and evaluating a novel process. BMC psychiatry 15, 166 (2015).
14. First, M. B. et al. Structured clinical interview for dsm-iv-tr axis i disorders, research version, patient edition. Tech. Rep., SCID-I/P (2002).
15. Steketee, G., Frost, R. & Bogart, K. The yale-brown obsessive compulsive scale: Interview versus self-report. Behav. Res. Ther. 34, 675–684 (1996).
16. Clopper, C. J. & Pearson, E. S. The use of confidence or fiducial limits illustrated in the case of the binomial. Biom. 404–413 (1934).
17. Colling, C. et al. Identification of the delivery of cognitive behavioural therapy for psychosis (cbtp) using a cross-sectional sample from electronic health records and open-text information in a large uk-based mental health case register. BMJ open 7, e015297 (2017).
18. Iqbal, E. et al. Identification of adverse drug events from free text electronic patient records and information in a large mental health case register. PloS one 10, e0134208 (2015).
19. Mykowiecka, A., Marciniak, M. & Kupść, A. Rule-based information extraction from patients’ clinical data. J. biomedical informatics 42, 923–936 (2009).
Acknowledgements
This work was supported by the Clinical Record Interactive Search (CRIS) system funded and developed by
the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley
NHS Foundation Trust and King’s College London and a joint infrastructure grant from Guy’s and St Thomas’
Charity and the Maudsley Charity (grant number BRC-2011-10035). We appreciated the technical support
from informatics personnel in the Biomedical Research Centre. For part of the time spent on this project, RH
was funded by a Medical Research Council (MRC) Population Health Scientist Fellowship (grant number MR/
J01219X/1). David Chandran, Chin-Kuo Chang, Deborah Ahn, Hitesh Shetty, Jyoti Sanyal, Robert Stewart
and Richard Hayes have all received salary support from the National Institute for Health Research (NIHR)
Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London.
The authors would like to gratefully acknowledge the assistance and advice rendered by Sumithra Velupillai,
Angus Roberts and Mizanur Khondoker of King’s College London. The views expressed are those of the author(s)
and not necessarily those of the NHS, the NIHR or the Department of Health.
Author Contributions
The project was conceived of by R.H. and L.d.H. The development of the application was led by D.C. D.A.R. and
M.F. contributed substantially to the implementation of the application. Work on extraction of datasets from
CRIS was done by H.S. and J.S. Earlier work on a M.L. approach was done by R.J. and M.B. The aforementioned
authors as well as R.S., C.-K.C., J.D., H.C., J.M.V. and F.S. all contributed substantially to the initial writing of the
paper and revisions that were subsequently made to it.
Additional Information
Competing Interests: R.H., R.S., R.J., H.S. and C.-K.C. have received research funding from Roche, Pfizer,
Janssen and Lundbeck.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons license and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the
copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2019
... This could mean that the false positives are often minimized. This is supported by various authors' explicitly indicating to optimize precision (Wu et al, 2013;Kadra et al, 2015;Iqbal et al, 2015;Colling et al, 2017;Ohno-Machado and Séroussi, 2019;Chandran et al, 2019;Haerian et al, 2012;Chilman et al, 2021), whereas only a few authors favour recall (Carson et al, 2019;Zhang et al, 2017;Viani et al, 2021). In some cases, the preference for Table 2: An overview of the machine learning methods with a reported F1-, precision-or recall score, sorted by F1-score. ...
... Haerian et al, 2012;Anderson et al, 2015;Castro et al, 2015;Iqbal et al, 2015;Patel et al, 2015;Gorrell et al, 2016;Geraci et al, 2017;Colling et al, 2017; Downs et al, 2017b,a;Scheurwegs et al, 2017;Tran and Kavuluru, 2017;Dai and Jonnagaddala, 2018;Fernandes et al, 2018;Carson et al, 2019;Chandran et al, 2019;Downs et al, 2019;Ohno-Machado and Séroussi, 2019;Zhu et al, 2020;Gagnon et al, 2020;Chen et al, 2018;Cusick et al, 2021;Topaz et al, 2021), events(Hammond et al, 2015), identify clinically relevant new information(Zhang et al, 2017), occupations(Chilman et al, 2021), drug information(Kadra et al, 2015;Hayes et al, 2015) or smoking detection(Wu et al, 2013). Additionally, we found the following prediction tasks in this review: predicting treatment outcome(Perlis et al, 2012;Rumshisky et al, 2016;Colling et al, 2020), patient readmission (Alvarez-Mellado et al, 2019) or patient violence(Menger et al, 2018(Menger et al, , 2019Mosteiro et al, 2021). ...
Preprint
Full-text available
Throughout the history of artificial intelligence, various algorithm branches have predominantly been used at different times. The last decade has been characterized by a shift from rule-based methods to self-learning methods. However, while the shift towards using ML methods is evident, there is no comparison of both methods for document classification. This systematic literature review focuses on the document classification in healthcare notes from electronic health records within psychiatry. We assess how these methods compare to each other in terms of classification performance and how they have developed throughout time, and we discuss potential directions of the field. We find that rule-based methods have had a higher performance for most of the last decade than machine-learning methods.Yet, the shift in representation techniques and algorithms used in recent years resulted in machine learning methods performing better.Dense document representation techniques, with mostly non-zero cells, outperform sparse representation techniques, with mostly zeros. Also, many neural networks outperform other self-learning- and rule-based methods. We find that state-of-the-art language models are barely employed in the psychiatric domain and expect an increase in the application of federated learning can increase the data availability for model training.
... 22 NLP has already created opportunities to analyse large textual data sets and can now accurately detect mentions of complex phenomena such as suicidality [23][24][25][26] and obsessive compulsive symptoms. 27 This study seeks to answer the question of whether an NLP application can derive information on the similarly complex and broad construct of adolescent mental health patient online activity. This will have implication for researchers wishing to undertake large-scale epidemiological research as well as clinicians who could use this personalised data to inform patient care. ...
Article
Full-text available
Objectives To assess the feasibility of using a natural language processing (NLP) application for extraction of free-text online activity mentions in adolescent mental health patient electronic health records (EHRs). Setting The Clinical Records Interactive Search system allows detailed research based on deidentified EHRs from the South London and Maudsley NHS Foundation Trust, a large south London Mental Health Trust providing secondary and tertiary mental healthcare. Participants and methods We developed a gazetteer of online activity terms and annotation guidelines, from 5480 clinical notes (200 adolescents, aged 11–17 years) receiving specialist mental healthcare. The preprocessing and manual curation steps of this real-world data set allowed development of a rule-based NLP application to automate identification of online activity (internet, social media, online gaming) mentions in EHRs. The context of each mention was also recorded manually as: supportive, detrimental or neutral in a subset of data for additional analysis. Results The NLP application performed with good precision (0.97) and recall (0.94) for identification of online activity mentions. Preliminary analyses found 34% of online activity mentions were considered to have been documented within a supportive context for the young person, 38% detrimental and 28% neutral. Conclusion Our results provide an important example of a rule-based NLP methodology to accurately identify online activity recording in EHRs, enabling researchers to now investigate associations with a range of adolescent mental health outcomes.
... Analyzing unstructured PRO data requires novel pipelines and methods. One approach is NLP, which uses AI to extract and annotate unstructured PRO data and identify features, such as specific types and attributes of symptoms [93]. ML, typically neural network techniques, is useful for examining associations of PRO features with clinical outcomes [94]. ...
Article
Full-text available
Patient-reported outcome measures (PROMs) are subjective assessments of health status or health-related quality of life. In childhood cancer survivors, PROMs can be used to evaluate the adverse effects of cancer treatment and guide cancer survivorship care. However, there are barriers to integrating PROMs into clinical practice, such as constraints in clinical validity, meaningful interpretation, and technology-enabled administration of the measures. This article discusses these barriers and proposes 10 important considerations for appropriate PROM integration into clinical care for choosing the right measure (considering the purpose of using a PROM, health profile vs. health preference approaches, measurement properties), ensuring survivors complete the PROMs (data collection method, data collection frequency, survivor capacity, self- vs. proxy reports), interpreting the results (scoring methods, clinical meaning and interpretability), and selecting a strategy for clinical response (integration into the clinical workflow). An example framework for integrating novel patient-reported outcome (PRO) data collection into the clinical workflow for childhood cancer survivorship care is also discussed. As we continuously improve the clinical validity of PROMs and address implementation barriers, routine PRO assessment and monitoring in pediatric cancer survivorship offer opportunities to facilitate clinical decision making and improve the quality of survivorship care.
... Recently, NLP applications have been extended to unstructured PRO and symptom data stored in electronic medical records (EMRs) [20,21]. A review study [22] found that most previous NLP applications for unstructured PRO data largely focused on rule-based classifications (eg, extracting prespecified keywords or phrases from free text to identify cancer-related symptoms [23]), followed by machine learning (ML) approach (eg, conditional random field model [20], support vector machine [SVM] [24], and boosting regression tree [25]) to analyze associations with clinical outcomes. ...
Preprint
BACKGROUND Assessing patient-reported outcomes (PROs) through interviews or conversations during clinical encounters provides insightful information about survivorship. OBJECTIVE This study aims to test the validity of natural language processing (NLP) and machine learning (ML) algorithms in identifying different attributes of pain interference and fatigue symptoms experienced by child and adolescent survivors of cancer versus the judgment by PRO content experts as the gold standard to validate NLP/ML algorithms. METHODS This cross-sectional study focused on child and adolescent survivors of cancer, aged 8 to 17 years, and caregivers, from whom 391 meaning units in the pain interference domain and 423 in the fatigue domain were generated for analyses. Data were collected from the After Completion of Therapy Clinic at St. Jude Children’s Research Hospital. Experienced pain interference and fatigue symptoms were reported through in-depth interviews. After verbatim transcription, analyzable sentences (ie, meaning units) were semantically labeled by 2 content experts for each attribute (physical, cognitive, social, or unclassified). Two NLP/ML methods were used to extract and validate the semantic features: bidirectional encoder representations from transformers (BERT) and Word2vec plus one of the ML methods, the support vector machine or extreme gradient boosting. Receiver operating characteristic and precision-recall curves were used to evaluate the accuracy and validity of the NLP/ML methods. RESULTS Compared with Word2vec/support vector machine and Word2vec/extreme gradient boosting, BERT demonstrated higher accuracy in both symptom domains, with 0.931 (95% CI 0.905-0.957) and 0.916 (95% CI 0.887-0.941) for problems with cognitive and social attributes on pain interference, respectively, and 0.929 (95% CI 0.903-0.953) and 0.917 (95% CI 0.891-0.943) for problems with cognitive and social attributes on fatigue, respectively. 
In addition, BERT yielded superior areas under the receiver operating characteristic curve for cognitive attributes on pain interference and fatigue domains (0.923, 95% CI 0.879-0.997; 0.948, 95% CI 0.922-0.979) and superior areas under the precision-recall curve for cognitive attributes on pain interference and fatigue domains (0.818, 95% CI 0.735-0.917; 0.855, 95% CI 0.791-0.930). CONCLUSIONS The BERT method performed better than the other methods. As an alternative to using standard PRO surveys, collecting unstructured PROs via interviews or conversations during clinical encounters and applying NLP/ML methods can facilitate PRO assessment in child and adolescent cancer survivors.
Article
Full-text available
Patient-reported (PRO) and clinician-reported (CRO) outcomes are assessment instruments that are completed by patients and trained healthcare professionals, respectively. A PRO is a report of the direct experience of the patient with a given disease condition. A CRO is an assessment of the condition of the patient by the healthcare provider. PROs may not be accessible to all patients, especially those suffering from severe disease conditions. CROs are time-consuming and therefore administered infrequently. In the present study, we introduce a new form of assessment, the digital-reported outcome (DRO), which is automatically derived from the medical notes of the patient. DROs have a low overhead and can be generated at each patient’s visit to complement other outcome-assessment instruments and enhance clinical decision support by identifying at-risk patients. In this study, a DRO is developed to evaluate the functional impairment in the daily activities of two cohorts of patients suffering from bipolar disorder and schizophrenia. The input of the DRO is a single medical note from the electronic medical record of the patient. This note is submitted to a hierarchical bidirectional encoder representations from transformers (BERT) model. First, a sentence-level embedding is produced for each sentence in the note using a token-level attention mechanism. Second, an embedding for the entire note is constructed using a sentence-level attention mechanism. Third, the final embedding is classified using a feed-forward neural network. The model is trained to classify patients into moderate or severe functioning impairment levels according to the general assessment of functioning (GAF) scale, a CRO instrument for the assessment of the impact of mental illness on the daily activities of the patient. The DRO is validated using medical notes that were labeled by multiple healthcare providers from different healthcare institutions. 
The results indicate that a general DRO is able to classify patients from the two cohorts according to the two functioning impairment levels (severe versus moderate) prior to the onset of disease with an AUC of 76%. Disease-specific DROs are only applicable after the onset of the disease and produced AUCs of nearly 85%. The methodology introduced in the present paper is practical and can support the automated monitoring of the severity of the functioning impairment of bipolar and schizophrenia patients. Extending the proposed DRO to other psychiatric conditions and types of impairments is the subject of ongoing research work.
Article
Background: Most previous studies make psychiatric diagnoses based on diagnostic terms. In this study we sought to augment Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5) diagnostic criteria with deep neural network models to make psychiatric diagnoses based on psychiatric notes. Methods: We augmented DSM-5 diagnostic criteria with self-attention-based bidirectional long short-term memory (BiLSTM) models to identify schizophrenia, bipolar, and unipolar depressive disorders. Given that the diagnostic criteria for psychiatric diagnosis include a certain symptom profile and functional impairment, we first extracted psychiatric symptoms and functional features with two approaches, including a lexicon-based approach and a dependency parsing approach. Then, we incorporated free-text discharge notes and extracted features for psychiatric diagnoses with the proposed models. Results: The micro-averaged F1 scores of the two automatic annotation approaches were greater than 0.8. BiLSTM models with self-attention outperformed the rule-based models with DSM-5 criteria in the prediction of schizophrenia and bipolar disorder, while the latter outperformed the former in predicting unipolar depressive disorder. Approaches for augmenting DSM-5 criteria with a self-attention-based BiLSTM outperformed both pure rule-based and pure deep neural network models. In terms of classification of psychiatric diagnoses, we observed that the performance for schizophrenia and bipolar disorder was acceptable. Conclusion: This DSM-5-augmented deep neural network models showed good performance in identifying psychiatric diagnoses from psychiatric notes. We conclude that it is possible to establish a model that consults clinical notes to make psychiatric diagnoses comparably to physicians. Further research will be extended to outpatient notes and other psychiatric disorders.
Article
Full-text available
Introduction Psychiatric disorders are diagnosed through observations of psychiatrists according to diagnostic criteria such as the DSM-5. Such observations, however, are mainly based on each psychiatrist's level of experience and often lack objectivity, potentially leading to disagreements among psychiatrists. In contrast, specific linguistic features can be observed in some psychiatric disorders, such as a loosening of associations in schizophrenia. Some studies explored biomarkers, but biomarkers have yet to be used in clinical practice. Aim The purposes of this study are to create a large dataset of Japanese speech data labeled with detailed information on psychiatric disorders and neurocognitive disorders to quantify the linguistic features of those disorders using natural language processing and, finally, to develop objective and easy-to-use biomarkers for diagnosing and assessing the severity of them. Methods This study will have a multi-center prospective design. The DSM-5 or ICD-11 criteria for major depressive disorder, bipolar disorder, schizophrenia, and anxiety disorder and for major and minor neurocognitive disorders will be regarded as the inclusion criteria for the psychiatric disorder samples. For the healthy subjects, the absence of a history of psychiatric disorders will be confirmed using the Mini-International Neuropsychiatric Interview (M.I.N.I.). The absence of current cognitive decline will be confirmed using the Mini-Mental State Examination (MMSE). A psychiatrist or psychologist will conduct 30-to-60-min interviews with each participant; these interviews will include free conversation, picture-description task, and story-telling task, all of which will be recorded using a microphone headset. In addition, the severity of disorders will be assessed using clinical rating scales. Data will be collected from each participant at least twice during the study period and up to a maximum of five times at an interval of at least one month. 
Discussion This study is unique in its large sample size and the novelty of its method, and has potential for applications in many fields. We have some challenges regarding inter-rater reliability and the linguistic peculiarities of Japanese. As of September 2022, we have collected a total of >1000 records from >400 participants. To the best of our knowledge, this data sample is one of the largest in this field. Clinical Trial Registration Identifier: UMIN000032141.
Article
Full-text available
Purpose: The South London and Maudsley National Health Service (NHS) Foundation Trust Biomedical Research Centre (SLaM BRC) Case Register and its Clinical Record Interactive Search (CRIS) application were developed in 2008, generating a research repository of real-time, anonymised, structured and open-text data derived from the electronic health record system used by SLaM, a large mental healthcare provider in southeast London. In this paper, we update this register's descriptive data, and describe the substantial expansion and extension of the data resource since its original development. Participants: Descriptive data were generated from the SLaM BRC Case Register on 31 December 2014. Currently, there are over 250 000 patient records accessed through CRIS. Findings to date: Since 2008, the most significant developments in the SLaM BRC Case Register have been the introduction of natural language processing to extract structured data from open-text fields, linkages to external sources of data, and the addition of a parallel relational database (Structured Query Language) output. Natural language processing applications to date have brought in new and hitherto inaccessible data on cognitive function, education, social care receipt, smoking, diagnostic statements and pharmacotherapy. In addition, through external data linkages, large volumes of supplementary information have been accessed on mortality, hospital attendances and cancer registrations. Future plans: Coupled with robust data security and governance structures, electronic health records provide potentially transformative information on mental disorders and outcomes in routine clinical care. The SLaM BRC Case Register continues to grow as a database, with approximately 20 000 new cases added each year, in addition to extension of follow-up for existing cases. Data linkages and natural language processing present important opportunities to enhance this type of research resource further, achieving both volume and depth of data. However, research projects still need to be carefully tailored, so that they take into account the nature and quality of the source information.
Article
Objective: Our primary objective was to identify cognitive behavioural therapy (CBT) delivery for people with psychosis (CBTp) using an automated method in a large electronic health record database. We also examined what proportion of service users with a diagnosis of psychosis were recorded as having received CBTp within their episode of care during defined time periods provided by early intervention or promoting recovery community services for people with psychosis, compared with published audits, and whether demographic characteristics differentially predicted the receipt of CBTp. Methods: Both free text using natural language processing (NLP) techniques and structured methods of identifying CBTp were combined and evaluated for positive predictive value (PPV) and sensitivity. Using inclusion criteria from two published audits, we identified anonymised cross-sectional samples of 2579 and 2308 service users respectively with a case note diagnosis of schizophrenia or psychosis for further analysis. Results: The method achieved a PPV of 95% and a sensitivity of 96%. Using the National Audit of Schizophrenia 2 criteria, 34.6% of service users were identified as ever having received at least one session and 26.4% at least two sessions of CBTp; these are higher percentages than previously reported by manual audit of a sample from the same trust that returned 20.0%. In the fully adjusted analysis, CBTp receipt was significantly (p<0.05) more likely in younger patients, in white and other when compared with black ethnic groups, and in patients with a diagnosis of other schizophrenia spectrum and schizoaffective disorder when compared with schizophrenia. Conclusions: The methods presented here offer a potential means of evaluating delivery of CBTp on a large scale, providing more scope for routine monitoring, cross-site comparisons and the promotion of equitable access.
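The two evaluation measures named in the abstract above can be illustrated with a minimal sketch. PPV is the share of flagged cases that are true cases, and sensitivity is the share of true cases that were flagged. The function name and the counts below are hypothetical and chosen only to show the arithmetic; they are not the study's data.

```python
def ppv_and_sensitivity(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Return (PPV, sensitivity) from counts against a manually coded reference."""
    if true_pos + false_pos == 0 or true_pos + false_neg == 0:
        raise ValueError("need at least one flagged case and one true case")
    ppv = true_pos / (true_pos + false_pos)          # flagged cases that are correct
    sensitivity = true_pos / (true_pos + false_neg)  # true cases that were found
    return ppv, sensitivity


# Hypothetical counts (NOT taken from the study), for illustration only.
ppv, sensitivity = ppv_and_sensitivity(true_pos=96, false_pos=5, false_neg=4)
print(f"PPV = {ppv:.2f}, sensitivity = {sensitivity:.2f}")
```

PPV is the same quantity that other abstracts on this page report as precision, and sensitivity is the same quantity as recall.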
Article
Objectives: Electronic healthcare records (EHRs) are a rich source of information, with huge potential for secondary research use. The aim of this study was to develop an application to identify instances of Adverse Drug Events (ADEs) from free text psychiatric EHRs. Methods: We used the GATE Natural Language Processing (NLP) software to mine instances of ADEs from free text content within the Clinical Record Interactive Search (CRIS) system, a de-identified psychiatric case register developed at the South London and Maudsley NHS Foundation Trust, UK. The tool was built around a set of four movement disorders (extrapyramidal side effects [EPSEs]) related to antipsychotic therapy and rules were then generalised such that the tool could be applied to additional ADEs. We report the frequencies of recorded EPSEs in patients diagnosed with a Severe Mental Illness (SMI) and then report performance in identifying eight other unrelated ADEs. Results: The tool identified EPSEs with >0.85 precision and >0.86 recall during testing. Akathisia was found to be the most prevalent EPSE overall and occurred in the Asian ethnic group with a frequency of 8.13%. The tool performed well when applied to most of the non-EPSEs but least well when applied to rare conditions such as myocarditis, a condition that appears frequently in the text as a side effect warning to patients. Conclusions: The developed tool allows us to accurately identify instances of a potential ADE from psychiatric EHRs. As such, we were able to study the prevalence of ADEs within subgroups of patients stratified by SMI diagnosis, gender, age and ethnicity. In addition, we demonstrated the generalisability of the application to other ADE types by producing a high precision rate on a non-EPSE-related set of ADE-containing documents. Availability: The application can be found at http://git.brc.iop.kcl.ac.uk/rmallah/dystoniaml.
Article
Antipsychotic prescription information is commonly derived from structured fields in clinical health records. However, utilising diverse and comprehensive sources of information is especially important when investigating less frequent patterns of medication prescribing such as antipsychotic polypharmacy (APP). This study describes and evaluates a novel method of extracting APP data from both structured and free-text fields in electronic health records (EHRs), and its use for research purposes. Using anonymised EHRs, we identified a cohort of patients with serious mental illness (SMI) who were treated in South London and Maudsley NHS Foundation Trust mental health care services between 1 January and 30 June 2012. Information about antipsychotic co-prescribing was extracted using a combination of natural language processing and a bespoke algorithm. The validity of the data derived through this process was assessed against a manually coded gold standard to establish precision and recall. Lastly, we estimated the prevalence and patterns of antipsychotic polypharmacy. Individual instances of antipsychotic prescribing were detected with high precision (0.94 to 0.97) and moderate recall (0.57 to 0.77). We detected baseline APP (two or more antipsychotics prescribed in any 6-week window) with 0.92 precision and 0.74 recall and long-term APP (antipsychotic co-prescribing for 6 months) with 0.94 precision and 0.60 recall. Of the 7,201 SMI patients receiving active care during the observation period, 338 (4.7%; 95% CI 4.2 to 5.2) were identified as receiving long-term APP. The most common co-prescriptions were two second-generation antipsychotics (64.8%), followed by a first-generation with a second-generation antipsychotic (32.5%). These results suggest that this is a potentially practical tool for identifying polypharmacy from mental health EHRs on a large scale. Furthermore, extracted data can be used to allow researchers to characterize patterns of polypharmacy over time including different drug combinations, trends in polypharmacy prescribing, predictors of polypharmacy prescribing and the impact of polypharmacy on patient outcomes.
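Two details in the abstract above lend themselves to a short sketch. The first is the baseline APP definition (two or more antipsychotics prescribed in any 6-week window); the second is the reported prevalence interval. The abstract does not describe the bespoke algorithm or how the confidence interval was computed, so everything below is an assumption: prescriptions are treated as dated point events, the window boundary is taken as inclusive at 42 days, the function names are hypothetical, and the interval uses a plain normal approximation.

```python
from datetime import date
from math import sqrt


def has_baseline_app(prescriptions, window_days=42):
    """True if two distinct drugs are prescribed within `window_days` of each other.

    `prescriptions` is an iterable of (date, drug_name) pairs. Treating each
    prescription as a single dated event is a simplifying assumption.
    """
    events = sorted(prescriptions)
    for i, (start, drug) in enumerate(events):
        for later, other in events[i + 1:]:
            if (later - start).days > window_days:
                break  # events are sorted, so later ones are further away
            if other != drug:
                return True
    return False


def wald_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Illustrative record; the drug names and dates are made up for the example.
record = [(date(2012, 1, 1), "olanzapine"), (date(2012, 1, 20), "haloperidol")]
print(has_baseline_app(record))

# Counts taken from the abstract: 338 of 7,201 patients on long-term APP.
low, high = wald_ci(338, 7201)
print(f"{100 * 338 / 7201:.1f}% (95% CI {100 * low:.1f} to {100 * high:.1f})")
# prints "4.7% (95% CI 4.2 to 5.2)"
```

Under these assumptions the normal approximation reproduces the reported interval to one decimal place, although the authors may have used a different method.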
Article
Several studies have demonstrated the reliability and validity of the Yale-Brown Obsessive Compulsive Scale (YBOCS) conducted by trained interviewers. The present study examined several aspects of a self-report YBOCS version relative to the usual interview format in two non-clinical samples (ns = 46 and 70) and in a clinical OCD sample (n = 36) and a clinical non-OCD group (n = 10). The self-rated instrument showed excellent internal consistency and test-retest reliability, performing somewhat better than the interview. There was good agreement between symptom checklist categories across the two versions, though clinical subjects reported more symptoms on the self-report form than on the interview. Some order effects were evident for non-clinical subjects only: those who received the self-report first scored lower on both self-report and interview than those who received the interview first. No order effects were observed in the clinical sample. The self-report version showed strong convergent validity with the interview, and discriminated well between OCD and non-OCD patients. Although more study is needed, particularly on clinical samples, these findings suggest that the self-report YBOCS may be a time-saving and less costly substitute for the interview format in assessing OCD symptoms.
Article
Clinical research is on the threshold of a new era in which electronic health records (EHRs) are gaining an important novel supporting role. While EHRs used for routine clinical care have some limitations at present, as discussed in this review, new improved systems and emerging research infrastructures are being developed to ensure that EHRs can be used for secondary purposes such as clinical research, including the design and execution of clinical trials for new medicines. EHR systems should be able to exchange information through the use of recently published international standards for their interoperability and clinically validated information structures (such as archetypes and international health terminologies), to ensure consistent and more complete recording and sharing of data for various patient groups. Such systems will counteract the obstacles of differing clinical languages and styles of documentation as well as the recognised incompleteness of routine records. Here we discuss some of the legal and ethical concerns of clinical research data reuse and technical security measures that can enable such research whilst protecting privacy. In the emerging research landscape, co-operation infrastructures are being built where research projects can utilise the availability of patient data from federated EHR systems from many different sites, as well as in international multi-lingual settings. Among several initiatives described, the EHR4CR project offers a promising method for clinical research. One of the first achievements of this project was the development of a protocol feasibility prototype which is used for finding patients eligible for clinical trials from multiple sources.
Medical data are, most of the time, very complex in both form and content. One of the greatest challenges for the IT community in healthcare is to enable the full utilization of these data by information systems. This variety, combined with the fact that data usually derive from diverse systems, is a great obstacle to this task. The result is that data stored in medical information systems usually do not accurately represent reality. In order to eliminate the discrepancy between stored and real data, specialized applications that facilitate and accelerate data import into information systems must be developed. This is the goal of Natural Language Processing, the scientific field that combines computer science and linguistics. To this end, NLP systems use applications for the coding and standardization of information, known as controlled medical vocabularies. The result of these processes is data that can be used by various technologies, such as clinical data warehouses and decision support systems, the functionality of which is fully dependent on the completeness and accuracy of the data on which their analysis is based.