Co-occurrence of the predicted labels. a Chord diagram of the model-predicted labels among 1000 samples. Most co-occurrence relationships match semantic word relationships in hematopathology. However, the model did not learn the exclusiveness of the label "normal" (see d), possibly because labels were treated independently during model training. b The label "myelodysplastic syndrome" often co-occurred with the labels "acute myeloid leukemia" and "hypercellular". c The label "myeloproliferative neoplasm" tended to co-occur with the labels "chronic myeloid leukemia", "hypercellular", "basophilia", and "eosinophilia". d The model did not learn the exclusiveness of the label "normal". [An interactive web version can be accessed via https://storage.googleapis.com/pathopatho/chord.html]
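The pairwise counts behind a chord diagram like this one can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each sample's prediction is available as a list of label strings, and the function names and toy predictions are hypothetical rather than taken from the publication.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(predictions):
    """Count how often each unordered pair of labels is predicted together."""
    counts = Counter()
    for labels in predictions:
        # Sort so each pair has one canonical key, e.g. ("a", "b") and never ("b", "a").
        for first, second in combinations(sorted(set(labels)), 2):
            counts[(first, second)] += 1
    return counts


def exclusivity_violations(predictions, exclusive_label="normal"):
    """Return predictions where the exclusive label appears alongside any other label."""
    return [
        labels
        for labels in predictions
        if exclusive_label in labels and len(set(labels)) > 1
    ]


# Hypothetical toy predictions (assumed for illustration, not data from the paper).
toy_predictions = [
    ["myelodysplastic syndrome", "hypercellular"],
    ["myelodysplastic syndrome", "acute myeloid leukemia"],
    ["normal"],
    ["normal", "hypercellular"],
]

counts = cooccurrence_counts(toy_predictions)
violations = exclusivity_violations(toy_predictions)
```

Each key in `counts` corresponds to one chord between two labels, and `violations` collects the samples in which "normal" was predicted together with another label, the pattern described in panel d.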

Source publication
Article
Background Pathology synopses consist of semi-structured or unstructured text summarizing visual information by observing human tissue. Experts write and interpret these synopses with high domain-specific knowledge to extract tissue semantics and formulate a diagnosis in the context of ancillary testing and clinical information. The limited number...


Citations

... 556 WSIs with more than 256 cells in the label set {Neutrophil, Metamyelocyte, Myelocyte, Promyelocyte, Blast, Erythroblast, Megakaryocyte nucleus, Lymphocyte, Monocyte, Plasma cell, Eosinophil, Basophil, Histiocyte, Mast cell} were selected. Labels were created by simplifying the predictions of a fine-tuned BERT model on WSI synopses, as previously reported by our group [46]. This BERT model's predictions took the form of a multi-label task. ...
... SupCon loss is also robust to noisy labels and more stable to hyperparameter settings such as optimizers and data augmentations [59]. Because our labels come from our previously published BERT model's predictions [46] rather than from experts, we must take its prediction errors into consideration. SupCon loss's tolerance to noisy labels made it an obvious option for this training. ...
... In terms of the weaknesses of our system, WSI labels came from our previously published BERT model's predictions [46] rather than from experts, and these predictions are not perfectly correct. This approach was used because it is not practically feasible to manually label many hundreds of WSIs with keywords from semi-structured diagnostic synopses. ...
Preprint
One of the goals of AI-based computational pathology is to generate compact WSI representations, identifying the essential information required for diagnosis. While such approaches have been applied to histopathology, few applications have been reported in cytology. Bone marrow aspirate cytology is the basis for key clinical decisions in hematology. However, visual inspection of aspirate specimens is a tedious and complex process subject to variation in interpretation, and hematopathology expertise is scarce. The ability to generate a compact representation of an aspirate specimen may form the basis for clinical decision support tools in hematology. We have previously published an end-to-end AI-based system for counting and classifying cells from bone marrow aspirate WSI. Using deep embeddings from this model, we construct bags of individual cell features from each WSI, and apply multiple instance learning to extract vector representations for each WSI. Using these representations in vector search, we achieved 0.58 ± 0.02 mAP@10 in WSI-level image retrieval, which outperforms the Random baseline (0.39 ± 0.1). Using a weighted k-nearest-neighbours (k-NN) model on these slide vectors, we predict five broad diagnostic labels on individual aspirate WSI with a weighted-macro-average F1 score of 0.57 ± 0.03 on the test set of 278 randomly sampled WSIs, which outperforms a classifier using empirical class prior probabilities (0.26 ± 0.02). We present the first example of exploring trainable mechanisms to generate compact, slide-level representations in bone marrow cytology with deep learning. This method has the potential to summarize complex semantic information in WSIs toward improved diagnostics in hematology, and may eventually support AI-assisted computational pathology approaches.
... When the data is textual, feature extraction is somewhat different: the aim is to create word or text embeddings. In medical NLP, at the feature extraction level, BERT (a state-of-the-art model) was used in several studies [14,15,16,17], whereas the Word2Vec model was used in [18,19,20]. At the algorithmic level, by contrast, various deep learning models have been widely used in medical NLP. ...
Article
Automatic symptom identification plays a crucial role in assisting doctors during the diagnosis process in telemedicine. In general, physicians spend considerable time on clinical documentation and symptom identification, which is impractical given their full schedules. With text-based consultation services in telemedicine, identifying symptoms from a user's consultation is a sophisticated and time-consuming process. Moreover, at Altibbi, an Arabic web-based telemedicine platform and the context of this work, users consult doctors and describe their conditions in different Arabic dialects, which makes the problem more complex and challenging. Therefore, in this work an advanced deep learning approach is developed for automatic extraction of symptoms based on a massive amount of multi-dialect medical consultations. The approach is formulated as a multi-label multi-class classification using features extracted based on AraBERT and fine-tuned on the bidirectional long short-term memory (BiLSTM) network. The fine-tuning of the BiLSTM relied on features engineered based on different variants of the bidirectional encoder representations from transformers (BERT). Evaluating the models based on precision, recall, and a customized hit rate showed successful identification of symptoms from Arabic texts with promising accuracy. Hence, this paves the way toward deploying an automated symptom identification model in production at Altibbi, which can help general practitioners in telemedicine provide more efficient and accurate consultations.