Background: The structure of psychopathology determines how we identify people who need support services and how we can best help them. Currently, we identify those with psychopathological issues via assessments based on diagnostic manuals, such as the International Statistical Classification of Diseases (ICD) and the Diagnostic and Statistical Manual of Mental Disorders (DSM). However, a growing literature has raised serious concerns about these two manuals. Some have suggested that such diagnostic manuals have misguided decades of mental health studies and may have contributed to dissatisfaction among service seekers and users, relating to ineffective treatments and negative experiences with service providers. This doctoral dissertation explores possible alternative approaches to understanding the structure of psychopathology and considers how these approaches could contribute to future classification, diagnostic and service delivery systems.
Method: We used one dataset for all four studies. It was mined from https://www.livejournal.com/ and consisted of narratives about lived experiences from people diagnosed with mental disorders. The data were analysed using Jaccard’s Coefficient to measure the similarity between diagnostic categories (study 1); K-Means Clustering to group symptoms into diagnostic categories (study 2); Network Analysis to find the relationships between co-occurring symptoms, together with Eigenvector Centrality to estimate which of them co-occur with most other symptoms (study 3); and standard correlation to find the strength of such associations (study 4).
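As an illustration of the similarity measure named for study 1, the following minimal sketch computes Jaccard’s coefficient, the size of the intersection of two sets divided by the size of their union, for pairs of diagnostic categories. The category names, the symptom sets and the helper names `jaccard` and `pairwise_jaccard` are hypothetical and are not taken from the thesis dataset; the thesis pipeline may represent symptoms differently.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard's coefficient |A ∩ B| / |A ∪ B|; returns 0.0 if both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical symptom sets per diagnostic category (illustrative only,
# not drawn from the thesis data).
symptoms_by_category = {
    "depression": {"low mood", "insomnia", "fatigue", "anxiety"},
    "anxiety disorder": {"anxiety", "insomnia", "restlessness", "fatigue"},
    "schizophrenia": {"auditory hallucination", "paranoia", "insomnia"},
}


def pairwise_jaccard(groups: dict) -> dict:
    """Return the Jaccard coefficient for every unordered pair of categories."""
    names = sorted(groups)
    return {
        (x, y): jaccard(groups[x], groups[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


overlap = pairwise_jaccard(symptoms_by_category)
for pair, score in overlap.items():
    print(pair, round(score, 3))
```

A coefficient near 1 means two categories share almost all of their reported symptoms, while a coefficient near 0 means they share almost none, which is why the measure can serve as an index of diagnostic overlap.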
Findings: Because existing studies evaluating the reliability of the current diagnostic system have limitations, we proposed an alternative approach for estimating that reliability by studying the extent of diagnostic overlap (heterogeneity) (study 1). Study 1 (chapter 4) contributes to the literature by being the first study to exploit patient narrative data, using innovative text-mining methods in this context, to assess the diagnostic heterogeneity of the DSM categories. It provides unique evidence that reinforces existing studies of diagnostic heterogeneity, using an alternative approach based on Jaccard’s coefficients. Having verified that the diagnostic heterogeneity of the traditional, human-led diagnostic categories is too large for practical use, we searched for the reasons. Many studies have attributed the problem to the committee members who created the manuals. Among the concerns raised are that committee members reported financial conflicts of interest with industry and relied more on consensus than on data. If so, eliminating the human component of decision making should allow us to find homogeneous groups of disorders. Therefore, we attempted to create categories of mental illnesses from patients’ reported symptoms using Artificial Intelligence (study 2). Study 2 (chapter 5) contributes to the literature by being the first study in this context to demonstrate how to cluster patients using artificial intelligence, based on the similarities in the symptoms or experiences reported in their illness narratives. Using unsupervised machine learning algorithms and the silhouette score elbow method, it provides evidence that contrasts with the conventional idea of conceptualising “mental illnesses as categories”.
In study 2, when the machine-driven approach also produced mental disorder categories with high heterogeneity, we inferred that, while there might have been human biases in the traditional diagnostic manuals, the more important point is that the categorical approach is not the way forward. The findings from study 2 thus support the existing literature, which states the same. The literature has proposed several alternatives, such as the dimensional and network approaches. However, related to this notion of diagnosing and studying humans (and their conditions) as categories, such as depression, consisting of individual entities (e.g., symptoms), there is another serious problem in the mental health research culture, one that has found its way into these new alternatives as well: the use of total scores of survey items as objects of inquiry (e.g., a total depression score). This approach assumes that all the items in a questionnaire (e.g., low mood, lack of interest) contribute in equal proportions to the construct (e.g., depression), but the empirical evidence suggests otherwise. Newer dimensional approaches such as HiTOP rely on such sum scores. Likewise, some network studies also use such sum scores. In doing so, such alternatives risk carrying forward some of the weaknesses of their categorical predecessors. As an alternative, we proposed the use of individual symptoms as the object of inquiry. It is a relatively novel approach, and we hoped it would advance the literature.
Therefore, we created a network of psychopathological symptoms based on patients’ reports (study 3). Study 3 (chapter 6) contributes to the literature by being the first study in this context to demonstrate how to create network graphs from pure narrative data from patients, and by presenting a new approach for exploratory analysis that finds inter-relations among the symptoms or experiences reported in patients’ illness narratives. It demonstrates a relatively novel approach that focuses on individual symptoms as the object of inquiry instead of categories of mental disorders or sum-scores of scales or questionnaires. The study discovered relationships based on co-occurrences of the reported symptoms, but it did not communicate the strength (“numeric” degree) of such associations. While finding an association has merit for preliminary exploration, we argue that, for this approach of using individual symptoms as an object of inquiry to be useful for clinical and research purposes, it must also provide information about the strength of the association. So, in the final study, we attempted to find the correlates of auditory hallucination and, in doing so, demonstrated how to find correlation coefficients between pairs of symptoms from a qualitative (text-based narrative) dataset.
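The difference between detecting a co-occurrence (study 3) and quantifying its strength (study 4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: each narrative is assumed to have already been reduced to the set of symptoms it mentions, the example narratives are invented, and "standard correlation" is assumed here to mean the Pearson coefficient computed on binary presence indicators (the phi coefficient). The function names `cooccurrence_counts` and `phi` are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Hypothetical data: each narrative reduced to the set of symptoms it mentions.
narratives = [
    {"auditory hallucination", "paranoia", "insomnia"},
    {"auditory hallucination", "paranoia"},
    {"insomnia", "low mood"},
    {"low mood"},
    {"auditory hallucination", "insomnia"},
    {"low mood", "insomnia"},
]


def cooccurrence_counts(docs):
    """Count how many narratives mention each unordered pair of symptoms.

    The resulting pairs can be read as weighted edges of a symptom network.
    """
    counts = {}
    for doc in docs:
        for a, b in combinations(sorted(doc), 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


def phi(docs, a, b):
    """Pearson correlation between the binary presence indicators of a and b."""
    x = [1 if a in d else 0 for d in docs]
    y = [1 if b in d else 0 for d in docs]
    n = len(docs)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a symptom is always or never present
    return cov / sqrt(var_x * var_y)


edges = cooccurrence_counts(narratives)
r_paranoia = phi(narratives, "auditory hallucination", "paranoia")
r_low_mood = phi(narratives, "auditory hallucination", "low mood")
```

In this toy data, auditory hallucination co-occurs with both paranoia and insomnia in two narratives each, so the co-occurrence counts alone do not distinguish the two pairs; the correlation coefficient additionally accounts for how often each symptom appears without the other.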
Furthermore, the correlations were valuable to the advancement of the theoretical literature on auditory hallucination. Study 4 (chapter 7) contributes to the literature by being the first study in this context to demonstrate how to conduct correlation analysis on qualitative data. It suggests a new direction for conducting exploratory research using rich qualitative datasets and standard statistical methods, without the limitations of a conventional survey dataset.
Conclusion: The doctoral thesis found that the traditional categorical approach does not accurately reflect the complexity of people’s experiences. Human biases and conflicts of interest might have influenced the creation of the diagnostic manuals. Still, even when artificial intelligence attempted to find similar patterns within the patients’ experiences, it could not do so, indicating that psychopathological experiences cannot be categorised into homogeneous groups. So, we argue that future mental health research should divorce itself from using DSM and ICD categories of mental disorders as the object of investigation and as the framework for conceptualising mental illnesses. Instead, we argue that the focus should be on alternative conceptualisations of psychopathology, such as the network model of psychopathology, which focuses on individual symptoms and the inter-relationships between them. Our preliminary network model explores specific relationships between symptoms that were found to occur frequently but are relatively less studied in the literature, opening up new lines of investigation for future studies to build upon.
Furthermore, using auditory hallucination as an object of investigation, we found the variables with the highest correlation coefficients and attempted to advance the psychosis literature. The first major merit and contribution of the doctoral thesis is to demonstrate how all of the above can be done using rich qualitative data. Unlike survey data, the current data did not pose any restrictions in terms of the number or type of variables being reported. The respondents reported everything that they had to report. Additionally, the thesis demonstrated how a large volume of qualitative data could be obtained and then analysed using statistical and machine learning-based approaches with minimal effort and time, using advanced technologies such as Natural Language Processing, Artificial Intelligence, and Web-Scraping. The second major merit and contribution of this thesis is to demonstrate how to use novel data analytic procedures such as Jaccard’s Coefficient, K-Means Clustering and Network Graphs, as well as conventional statistics such as correlation coefficients, on such qualitative datasets. No manual analysis of the qualitative data, such as thematic analysis, was done. The third merit and contribution of this thesis is to advance the literature by evaluating diagnostic heterogeneity between categories of mental disorders using a novel approach (study 1); finding the symptoms that are exclusive to each cluster of mental disorders (study 2); estimating the tendency of specific symptoms to co-occur with other symptoms (study 3); and finding the symptoms associated with auditory hallucination (study 4). Future mental health studies will benefit from this contribution and are expected to produce deeper insight into mental conditions and the treatment of mental ill-health.