Missing data imputation is an important task in cases where it is crucial to use all available data and not discard records with missing values. This work evaluates the performance of several statistical and machine learning imputation methods that were used to predict recurrence in patients in an extensive real breast cancer data set.
Imputation methods based on statistical techniques, e.g., mean, hot-deck and multiple imputation, and machine learning techniques, e.g., multi-layer perceptron (MLP), self-organising maps (SOM) and k-nearest neighbour (KNN), were applied to data collected through the "El Álamo-I" project, and the results were then compared to those obtained from the listwise deletion (LD) imputation method. The database includes demographic, therapeutic and recurrence-survival information from 3679 women with operable invasive breast cancer diagnosed in 32 different hospitals belonging to the Spanish Breast Cancer Research Group (GEICAM). The accuracies of predictions on early cancer relapse were measured using artificial neural networks (ANNs), with different ANNs trained on the data sets whose missing values had been imputed.
The imputation methods based on machine learning algorithms outperformed the statistical imputation methods in the prediction of patient outcome. Friedman's test revealed a significant difference (p=0.0091) in the observed area under the ROC curve (AUC) values, and the pairwise comparison test showed that the AUCs for MLP, KNN and SOM were significantly higher (p=0.0053, p=0.0048 and p=0.0071, respectively) than the AUC from the LD-based prognosis model.
The methods based on machine learning techniques were the most suited for the imputation of missing values and led to a significant enhancement of prognosis accuracy compared to imputation methods based on statistical procedures.
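The k-nearest-neighbour imputation evaluated above can be sketched as follows. This is a minimal illustration of the general technique, not the authors' implementation: the use of Euclidean distance over the jointly observed features and averaging of the neighbours' values are assumptions.

```python
import numpy as np

def knn_impute(X, k=3):
    """Impute NaNs in each row from the k nearest fully observed rows.

    Distance is Euclidean over the features observed in the incomplete row.
    Minimal sketch: assumes at least k fully observed rows exist.
    """
    X = np.asarray(X, dtype=float)
    complete = X[~np.isnan(X).any(axis=1)]
    out = X.copy()
    for i, row in enumerate(X):
        miss = np.isnan(row)
        if not miss.any():
            continue
        obs = ~miss
        # distances to all complete rows over the observed features only
        d = np.sqrt(((complete[:, obs] - row[obs]) ** 2).sum(axis=1))
        nearest = complete[np.argsort(d)[:k]]
        out[i, miss] = nearest[:, miss].mean(axis=0)
    return out
```

In practice, the choice of k and the distance metric (and whether neighbours are weighted by distance) would be tuned on the data set at hand.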
The process of patient care performed by an anaesthesiologist during highly invasive surgery requires fundamental knowledge of the physiologic processes and long-standing experience in patient management to cope with the inter-individual variability of the patients. Biomedical engineering research improves the patient monitoring task by providing technical devices to measure a large number of a patient's vital parameters. These measurements improve the safety of the patient during the surgical procedure, because pathological states can be recognised earlier, but may also lead to an increased cognitive load of the physician. In order to reduce cognitive strain and to support intra-operative monitoring for the anaesthesiologist, an intelligent patient monitoring and alarm system has been proposed and implemented that evaluates a patient's haemodynamic state on the basis of a current vital parameter constellation with a knowledge-based approach. In this paper, general design aspects and evaluation of the intelligent patient monitoring and alarm system in the operating theatre are described. The validation of the inference engine of the intelligent patient monitoring and alarm system was performed in two steps. Firstly, the knowledge base was validated with real patient data acquired online in the operating theatre. Secondly, a research prototype of the whole system was implemented in the operating theatre. In the first step, the anaesthetists were asked to enter a state variable evaluation into a recording system before a drug application or any other intervention on the patient. These state variable evaluations were compared to those generated by the intelligent alarm system on the same vital parameter constellations. Altogether, 641 state variable evaluations were entered by six different physicians. In total, the sensitivity of alarm recognition is 99.3%, the specificity is 66% and the predictability is 45%.
The second step was performed using a research prototype of the system in anaesthesiological routine. The evaluation of 684 events yielded a sensitivity, specificity and predictability of the alarm recognition of more than 99%.
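The sensitivity, specificity and predictability figures reported above follow directly from the alarm confusion matrix; "predictability" is taken here to mean the positive predictive value, which is an assumption about the authors' terminology. A minimal sketch:

```python
def alarm_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and predictability (positive predictive
    value) of alarm recognition, from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)       # alarms raised among true alarm states
    specificity = tn / (tn + fp)       # silence kept among true normal states
    predictability = tp / (tp + fp)    # true alarms among all raised alarms
    return sensitivity, specificity, predictability
```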
This paper presents a conceptual model, developed from a characterization of human papillomavirus type 16 (HPV16), which is used to build a simulation prototype of the HPV16 growth process.
The human papillomavirus type 16 is the principal virus detected in invasive lesions of cervical cancer, and is associated with greater persistence and prevalence in pre-malignant and malignant lesions. The probability of acquiring an HPV16 infection is extremely high in sexually active individuals. However, an HPV16 infection can disappear after becoming a histologically confirmed case. According to the characterization of HPV16 proposed in this paper, cells behave as a complex system, much like a society: they act in a cooperative manner, following a set of rules defined by local interactions among them. Such a complex system is defined by combining a cellular automaton and agent-based models. In this way, the behavior of HPV16 is simulated by allowing the cellular automaton to follow such parameterized behavior rules.
Both cross-sectional and prospective studies indicate that HPV16 infection persistence increases the risk of high-grade CIN, as observed in the results provided by the growth simulation model of HPV16. The average growth rate extrapolated over 52 weeks (12 months) and calculated by the model showed a 37.87% growth for CIN1, 35.53% for CIN2 and 16.92% for CIN3. Remarkably, these results are similar to those obtained and reported by clinical studies. For example, the results obtained using the proposed model for CIN2 differ by 0.53 percentage points from those reported by Östör and by 2.23 percentage points from those reported by Insinga et al. Likewise, for CIN3, the results obtained using the proposed model differ by 2.92 percentage points from the Insinga et al. results.
Through the specification of parameterized behavior rules for HPV16 that are simulated under the combined technique of cellular automata and agent-based models, the HPV life cycle can be simulated allowing for observations at different stages. The proposed model then can be used as a support tool in the investigation of HPV16, in particular (as part of our future work) to develop drugs as agents in the control of the HPV16 disease.
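The combined cellular-automaton/agent-based simulation described above can be illustrated with a toy growth automaton. The spread rule, neighbourhood, and probability parameter below are purely illustrative stand-ins for the paper's parameterized HPV16 behavior rules:

```python
import random

def grow(grid, p_spread, steps, seed=0):
    """Toy cellular automaton for lesion growth: an infected cell (1)
    spreads to each 4-connected healthy neighbour (0) with probability
    p_spread per step. Rules and rates are illustrative only."""
    rng = random.Random(seed)
    n = len(grid)
    for _ in range(steps):
        nxt = [row[:] for row in grid]
        for i in range(n):
            for j in range(n):
                if grid[i][j] == 1:
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        a, b = i + di, j + dj
                        if 0 <= a < n and 0 <= b < n and grid[a][b] == 0:
                            if rng.random() < p_spread:
                                nxt[a][b] = 1
        grid = nxt
    return grid
```

Growth rates analogous to those reported for CIN1-CIN3 could then be read off as the fraction of newly infected cells per simulated week, under whatever spread parameters a given lesion grade is assigned.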
Exploiting information technology may have a great impact on improving cooperation and interoperability among the different professionals taking part in the process of delivering health care services. New paradigms are therefore being devised that consider software systems as autonomous agents able to help professionals in accomplishing their duties. To this end, those systems should encapsulate the skills for solving a given set of tasks and possess the social ability to cooperate in order to fetch the required information and knowledge. This paper illustrates a methodology facilitating the development of interoperable intelligent software agents for medical applications and proposes a generic computational model for implementing them. That model may be specialized in order to support all the different information- and knowledge-related requirements of a Hospital Information System. The architecture is being tested by implementing a prototype system able to coordinate the joint efforts of the professionals involved in managing patients affected by Acute Myeloid Leukemia.
Our goal is to propose and solve a new formulation of the recently-formalized patient admission scheduling problem, extending it by including several real-world features, such as the presence of emergency patients, uncertainty in the length of stay, and the possibility of delayed admissions.
We devised a metaheuristic approach that solves both the static (predictive) and the dynamic (daily) versions of this new problem, which is based on simulated annealing and a complex neighborhood structure.
The quality of our metaheuristic approach is compared with an exact method based on integer linear programming. The main outcome is that our method is able to solve large cases (up to 4000 patients) in a reasonable time, whereas the exact method can solve only small/medium-size instances (up to 250 patients). For such datasets, the two methods obtain results at the same level of quality. In addition, the gap between our (dynamic) solver and the static one, which has all information available in advance, is only 4-5%. Finally, we propose (and publish on the web) a large set of new instances, and we discuss the impact of their features in the solution process.
The metaheuristic approach proved to be a valid search method to solve dynamic problems in the healthcare domain.
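The simulated-annealing core of the metaheuristic above can be sketched generically. This skeleton is an illustration of the technique only; the paper's neighbourhood structure for admission scheduling is far richer, and the cost function, cooling schedule and parameters here are assumptions:

```python
import math
import random

def simulated_annealing(cost, neighbour, x0, t0=1.0, cooling=0.95,
                        iters=1000, seed=0):
    """Generic simulated annealing: accept worsening moves with
    probability exp(-delta / temperature), geometrically cooled."""
    rng = random.Random(seed)
    x, c = x0, cost(x0)
    best, best_c = x, c
    t = t0
    for _ in range(iters):
        y = neighbour(x, rng)
        cy = cost(y)
        # accept improvements always, worsenings with Boltzmann probability
        if cy < c or rng.random() < math.exp((c - cy) / max(t, 1e-12)):
            x, c = y, cy
            if c < best_c:
                best, best_c = x, c
        t *= cooling
    return best, best_c
```

For example, minimizing a one-dimensional quadratic with uniform random moves converges quickly; a scheduling solver would replace `cost` with the admission objective and `neighbour` with the complex move operators the paper describes.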
Therapy planning plays an increasingly important role in the everyday work of physicians. Clinical protocols or guidelines are typically represented using flow-charts, decision tables, or plain text. These representations are badly suited, however, for complex medical procedures. One representation method that overcomes these problems is the language Asbru. But because Asbru has a LISP-like syntax (and also incorporates many concepts from computer science), it is not suitable for physicians. Therefore, we developed a visualization and user interface to deal with treatment plans expressed in Asbru. We use graphical metaphors to make the underlying concepts easier to grasp, employ glyphs to communicate complex temporal information, and use colors to make it possible to understand the connection between the two views (Topological View and Temporal View) available in the system. In this paper, we present the design ideas behind AsbruView, and discuss its usefulness based on the results of a usability study we performed with six physicians.
Objective: This paper describes a methodology which enables computer-aided support for the planning, visualization and execution of personalized patient treatments in a specific healthcare process, taking into account complex temporal constraints and ...
This article delineates a relatively unknown path in the history of medical philosophy and medical diagnosis. It is concerned with the phenomenon of vagueness in the physician's "style of thinking" and with the use of fuzzy sets, systems, and relations with a view to create a model of such reasoning when physicians make a diagnosis. It represents specific features of medical ways of thinking that were mentioned by the Polish physician and philosopher Ludwik Fleck in 1926. The paper links Lotfi Zadeh's work on system theory before the age of fuzzy sets with system-theory concepts in medical philosophy that were introduced by the philosopher Mario Bunge, and with the fuzzy-theoretical analysis of the notions of health, illness, and disease by the Iranian-German physician and philosopher Kazem Sadegh-Zadeh.
Some proposals to apply fuzzy sets in medicine were based on a suggestion made by Zadeh: symptoms and diseases are fuzzy in nature and fuzzy sets are feasible to represent these entity classes of medical knowledge. Yet other attempts to use fuzzy sets in medicine were self-contained. The use of this approach contributed to medical decision-making and the development of computer-assisted diagnosis in medicine.
With regard to medical philosophy, decision-making, and diagnosis, the framework of fuzzy sets, systems, and relations is very useful for dealing with the absence of sharp boundaries of the sets of symptoms, diagnoses, and phenomena of diseases. The foundations of reasoning and computer assistance in medicine were the result of a rapid accumulation of data from medical research. This explosion of knowledge in medicine gave rise to the speculation that computers could be used for medical diagnosis. Medicine became, to a certain extent, a quantitative science. In the second half of the 20th century, medical knowledge started to be stored in computer systems. To assist physicians in medical decision-making and patient care, medical expert systems using the theory of fuzzy sets and relations (such as the Viennese "fuzzy version" of the Computer-Assisted Diagnostic System, CADIAG, which was developed at the end of the 1970s) were constructed. The development of fuzzy relations in medicine and their application in computer-assisted diagnosis show that this fuzzy approach is a framework for dealing with the "fuzzy mode of thinking" in medicine.
Diagnosis based on medical image data is common in medical decision making and clinical routine. We discuss a strategy to derive a classifier with good performance on clinical image data and to justify the properties of the classifier by an adapted simulation model of image data. We focus on the problem of classifying eyes as normal or glaucomatous based on 62 routine explanatory variables derived from laser scanning images of the optic nerve head. As the learning sample, we use a case-control study of 98 normal and 98 glaucomatous subjects matched by age and sex. Aggregating multiple unstable classifiers allows substantial reduction of misclassification error in many applications and benchmark problems. We investigate the performance of various classifiers for the clinical learning sample as well as for a simulation model of eye morphologies. Bagged classification trees (bagged-CTREE) are compared to single classification trees and linear discriminant analysis (LDA). We additionally compare three estimators of misclassification error: 10-fold cross-validation, the 0.632+ bootstrap and the out-of-bag estimate. In summary, the application of our strategy of knowledge-based decision support shows that bagged classification trees perform best for glaucoma classification.
The local fuzzy fractal dimension (LFFD) is proposed to extract local fractal feature of medical images. The definition of LFFD is an extension of the pixel-covering method by incorporating the fuzzy set. Multi-feature edge detection is implemented with the LFFD and the Sobel operator. The LFFD can also serve as a characteristic of motion in medical image sequences. The experimental results show that the LFFD is an important feature of edge areas in medical images and can provide information for segmentation of echocardiogram image sequences.
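The pixel-covering idea that the LFFD extends is the classical box-counting estimate of fractal dimension. The sketch below shows the crisp (non-fuzzy) baseline on a binary image; the paper's contribution, not reproduced here, is to replace the hard "box is covered" test with fuzzy membership:

```python
import numpy as np

def box_counting_dimension(img, sizes=(1, 2, 4, 8)):
    """Crisp box-counting estimate of fractal dimension for a binary
    image: count occupied s-by-s boxes at each scale s, then fit the
    slope of log(count) against log(1/s)."""
    img = np.asarray(img, dtype=bool)
    counts = []
    for s in sizes:
        n = 0
        for i in range(0, img.shape[0], s):
            for j in range(0, img.shape[1], s):
                if img[i:i + s, j:j + s].any():
                    n += 1
        counts.append(n)
    coeffs = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return coeffs[0]
```

A filled region yields a dimension near 2 and a thin line near 1, which is why the measure discriminates smooth areas from edge areas in an image.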
In this paper a new approach to tomographic image reconstruction from projections is developed and investigated.
To solve the reconstruction problem, a special neural network that resembles a Hopfield net is proposed. The reconstruction process is performed by minimizing the energy function of this network. To improve the performance of the reconstruction process, an entropy term is incorporated into the energy expression.
The approach presented in this paper significantly decreases the complexity of the reconstruction problem.
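The energy-minimization idea can be illustrated with plain gradient descent on a Hopfield-style energy with an entropy term. The specific energy form, regularization weight and dynamics below are assumptions for illustration, not the network actually proposed in the paper:

```python
import numpy as np

def reconstruct(A, b, lam=0.01, lr=0.05, iters=500):
    """Gradient descent on an illustrative Hopfield-style energy

        E(x) = 0.5 * ||A x - b||^2 + lam * sum(x * log x),

    where A is the projection operator, b the measured projections and
    x the image. The entropy term keeps x positive and smooths it."""
    x = np.full(A.shape[1], 0.5)
    for _ in range(iters):
        grad = A.T @ (A @ x - b) + lam * (np.log(x) + 1.0)
        x = np.clip(x - lr * grad, 1e-6, None)  # stay in entropy's domain
    return x
```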
Two-dimensional electrophoresis (2DE) is a separation technique that can identify target proteins existing in a tissue. Its result is represented by a gel image that displays an individual protein in a tissue as a spot. However, because the technique suffers from low reproducibility, a user should manually annotate landmark spots on each gel image to analyze the spots of different images together. This operation is an error-prone and tedious job. For this reason, this paper proposes a method of extracting landmark spots automatically by using a data mining technique.
A landmark profile which summarizes the characteristics of landmark spots in a set of training gel images of the same tissue is generated by extracting the common properties of the landmark spots. On the basis of the landmark profile, candidate landmark spots in a new gel image of the same tissue are identified, and final landmark spots are determined by the well-known A* search algorithm.
The performance of the proposed method is analyzed through a series of experiments in order to identify its various characteristics.
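The A* search that determines the final landmark spots is the standard best-first algorithm; the grid world below is only a generic illustration of it (the paper searches over candidate-spot assignments rather than grid cells):

```python
import heapq

def a_star(grid, start, goal):
    """Standard A* on a 4-connected grid of 0 (free) / 1 (blocked)
    cells with a Manhattan-distance heuristic. Returns the length of
    the shortest path, or None if the goal is unreachable."""
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    n, m = len(grid), len(grid[0])
    open_set = [(h(start), 0, start)]   # (f = g + h, g, node)
    g = {start: 0}
    while open_set:
        _, gc, cur = heapq.heappop(open_set)
        if cur == goal:
            return gc
        if gc > g.get(cur, float("inf")):
            continue                    # stale queue entry
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (cur[0] + di, cur[1] + dj)
            if 0 <= nb[0] < n and 0 <= nb[1] < m and grid[nb[0]][nb[1]] == 0:
                ng = gc + 1
                if ng < g.get(nb, float("inf")):
                    g[nb] = ng
                    heapq.heappush(open_set, (ng + h(nb), ng, nb))
    return None
```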
This paper describes several concepts and metrics that may be used to assess various aspects of the quality of neural net classifiers. Each concept describes a property that may be taken into account by both designers and users of neural net classifiers when assessing their utility. Besides metrics for assessment of the correctness of classifiers we also introduce metrics that address certain aspects of the misclassifications. We show the applicability of the introduced quality concepts for selection among several neural net classifiers in the domain of thyroid disorders.
The study described in this paper concerns natural object modeling in the context of uncertain, imprecise and inconsistent representation. We propose a fuzzy system which offers a global modeling of object properties such as color, shape, velocity, etc. This modeling makes a transition from low-level reasoning (pixel level), which implies a local, precise but uncertain representation, to high-level reasoning (region level), inducing a certain assignment. So, we use fuzzy structured partitions characterizing these properties. At this level, each property will have its own global modeling. Then, these different models are merged for decision making. Our approach was tested with several applications. In particular, we show here its performance in the area of blood flow analysis from 3D color Doppler images, in order to quantify and study the development of this flow. We present methods that detect and correct the aliasing phenomenon, i.e. inconsistent information. At first, the flow space is partitioned into fuzzy sectors, where each sector is defined by a center, an angle and a direction. In parallel, the velocity information carried by the pixels is classified into fuzzy classes. Then, by combining these two partitions, we obtain the velocity distribution over sectors. Moreover, for each found path (from the first sector to the last one), we locate and correct inconsistent velocities by applying global rules. After extracting some meaningful sector features, the fuzzy modeling, applied to the aliasing correction, makes it possible to simplify and synthesize the blood flow direction.
The objective of this paper is to classify 3D medical images by analyzing spatial distributions to model and characterize the arrangement of the regions of interest (ROIs) in 3D space.
Two methods are proposed for facilitating such classification. The first method uses measures of similarity, such as the Mahalanobis distance and the Kullback-Leibler (KL) divergence, to compute the difference between spatial probability distributions of ROIs in an image of a new subject and each of the considered classes represented by historical data (e.g., normal versus disease class). A new subject is predicted to belong to the class corresponding to the most similar dataset. The second method employs the maximum likelihood (ML) principle to predict the class that most likely produced the dataset of the new subject.
The proposed methods have been experimentally evaluated on three datasets: synthetic data (mixtures of Gaussian distributions), realistic lesion-deficit data (generated by a simulator conforming to a clinical study), and functional MRI activation data obtained from a study designed to explore neuroanatomical correlates of semantic processing in Alzheimer's disease (AD).
The performed experiments demonstrated that the approaches based on the KL divergence and the ML method provide superior accuracy compared to the Mahalanobis distance. The latter technique could still be a method of choice when the distributions differ significantly, since it is faster and less complex. The obtained classification accuracy, with errors smaller than 1%, indicates that useful diagnostic assistance could be achieved, assuming sufficiently informative historical data and sufficient information on the new subject.
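The Mahalanobis-based similarity method can be sketched as a nearest-class-distribution classifier. The class names and toy parameters below are hypothetical; in the study each class distribution would be estimated from the historical ROI data:

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance of point x from a class (mean, cov)."""
    d = x - mean
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

def classify(x, classes):
    """Assign x to the class with the smallest Mahalanobis distance.
    classes: dict mapping class name -> (mean, cov). A toy stand-in
    for the similarity-based method described above."""
    return min(classes, key=lambda c: mahalanobis(x, *classes[c]))
```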
This paper discusses an approach towards Knowledge-Based Systems (KBS) development which emphasizes fit in the clinical organization, utility and safety. KBS design is identified as a subset of Decision Support System (DSS) design, and experiences from the use of development methods are made available from the more general field. Generic and specific issues related to making KBS development result in systems used in clinical practice are discussed, and an integration of the Logic Engineering KBS technique into the Action Design DSS requirements specification method is outlined. Group-based knowledge modeling is identified as the bridge between the methods. It is concluded that while Action Design provides organizational validation (Are we building the right system?), Logic Engineering adds KBS design verification (Are we going to build the system right?).
Artificial intelligence in medicine (AIM) has reached a period of adolescence in which interactions with the outside world are not only natural but mandatory. Although the basic research topics in AIM may be those of artificial intelligence, the applied issues touch more generally on the broad field of medical informatics. To the extent that AIM research is driven by performance goals for biomedicine, AIM is simply one component within a wide range of research and development activities. Furthermore, an adequate appraisal of AIM research requires an understanding of the research motivations, the complexity of the problems, and a suitable definition of the criteria for judging the field's success. Effective fielding of AIM systems will be dependent on the development of integrated environments for communication and computing that allow merging of knowledge-based tools with other patient data-management and information-retrieval applications. The creation of this kind of infrastructure will require vision and resources from leaders who realize that the practice of medicine is inherently an information-management task and that biomedicine must make the same kind of coordinated commitment to computing technologies as have other segments of our society in which the importance of information management is well understood.
Progress notes are narrative summaries about the status of patients during the course of treatment or care. Time and efficiency pressures have ensured clinicians' continued preference for unstructured text over entering data in forms when composing progress notes. The ability to extract meaningful data from the unstructured text contained within the notes is invaluable for retrospective analysis and decision support. The automatic extraction of data from unstructured notes, however, has largely been prevented by the complexity of handling abbreviations, misspellings, punctuation errors and other types of noise.
We present a robust system for cleaning noisy progress notes in real-time, with a focus on abbreviations and misspellings.
The system uses statistical semantic analysis based on Web data and the occasional participation of clinicians to automatically replace abbreviations with the actual senses and misspellings with the correct words.
An accuracy as high as 88.73% was achieved based only on statistical semantic analysis using Web data. The response time of the system with the caching mechanism enabled is 1.5-2 s per word, which is about the same as the average typing speed of clinicians.
The overall accuracy and the response time of the system will improve with time, especially when the confidence mechanism is activated through clinicians' interactions with the system. This system will be implemented in a clinical information system to drive interactive decision support and analysis functions leading to improved patient care and outcomes.
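The misspelling-replacement step can be illustrated with a minimal edit-distance corrector. This is not the paper's Web-based statistical semantic analysis: here a simple frequency table stands in for the Web statistics, and the tie-breaking rule is an assumption:

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming (one rolling row)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def correct(word, lexicon_freq):
    """Pick the lexicon word closest in edit distance, breaking ties by
    frequency -- a crude proxy for Web-frequency statistics.
    lexicon_freq: dict word -> frequency."""
    return min(lexicon_freq,
               key=lambda w: (edit_distance(word, w), -lexicon_freq[w]))
```

A production system would additionally use context (the surrounding words) to choose among candidates, which is exactly where the semantic analysis described above comes in.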
Computed tomography images are becoming an invaluable means for abdominal organ investigation. In the field of medical image processing, some of the current interests are the automatic diagnosis of liver, spleen, and kidney pathologies, and the 3D volume rendering of these abdominal organs. Their automatic segmentation is the first and fundamental step in all these studies, but it is still an open problem.
In this paper we propose a fully automatic, gray-level based segmentation framework based on a multiplanar fast marching method. The proposed segmentation scheme is general, and employs only established and not critical anatomical knowledge. For this reason, it can be easily adapted to segment different abdominal organs, by overcoming problems due to the high inter- and intra-patient gray-level, and shape variabilities; the extracted volumes are then combined to produce the final results.
The system has been evaluated by computing the symmetric volume overlap (SVO) between the automatically segmented (liver and spleen) volumes and the volumes manually traced by radiological experts. The test dataset is composed of 60 images, where 40 images belong to a private dataset and 20 images to a public one. Liver segmentation has achieved an average SVO ≈ 94%, which is comparable to the mean intra- and inter-personal variation (96%). Spleen segmentation achieves similar, promising results (SVO ≈ 93%). The comparison of these results with those achieved by active contour models (SVO ≈ 90%) and topology adaptive snakes (SVO ≈ 92%) proves the efficacy of our system.
The described segmentation method is a general framework that can be adapted to segment different abdominal organs, achieving promising segmentation results. It has to be noted that its performance could be further improved by incorporating shape based rules.
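The symmetric volume overlap used in the evaluation can be computed as below. The paper does not spell out its exact formula, so equating SVO with the Dice overlap 2|A∩B|/(|A|+|B|), expressed in percent, is an assumption:

```python
import numpy as np

def symmetric_volume_overlap(a, b):
    """Symmetric volume overlap between two binary volumes, here taken
    as the Dice coefficient 2|A ∩ B| / (|A| + |B|) in percent."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    return 100.0 * 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())
```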
Clinical diagnosis in acute abdominal pain is still a major problem. Computer-aided diagnosis offers some help; however, existing systems still produce high error rates. We therefore tested machine learning techniques in order to improve standard statistical systems. The investigation was based on a prospective clinical database with 1254 cases, 46 diagnostic parameters and 15 diagnoses. Independence Bayes and the automatic rule induction techniques ID3, NewId, PRISM, CN2, C4.5 and ITRULE were trained with 839 cases and separately tested on 415 cases. No major differences in overall accuracy were observed (43-48%), except for NewId, which was below the average. Between the different techniques some similarities were found, but also considerable differences with respect to specific diagnoses. Machine learning techniques did not improve the results of the standard model Independence Bayes. Problem dimensionality, sample size and model complexity are major factors influencing diagnostic accuracy in computer-aided diagnosis of acute abdominal pain.
Learning from patient records may aid knowledge acquisition and decision making. Existing inductive machine learning (ML) systems such as NewId, CN2, C4.5 and AQ15 learn from past case histories using symbolic and/or numeric values. These systems learn symbolic rules (of the IF ... THEN kind) which link an antecedent set of clinical factors to a consequent class or decision. This paper compares the learning performance of alternative ML systems with each other and with respect to a novel approach using logic minimization, called LML, to learn from data. Patient cases were taken from the archives of the Paediatric Surgery Clinic of the University Hospital of Crete, Heraklion, Greece. Comparison of ML system performance is based both on classification accuracy and on informal expert assessment of the learned knowledge.
The paper describes conception and prototypical design of a decision-support server for acute abdominal pain. Existing formal methods to develop and exchange scores, guidelines and algorithms are used for integration. For scoring systems a work-up to separate terminological information from structure is described. The terminology is separately stored in a data dictionary and the structure in a knowledge base. This procedure enables a reuse of terminology for documentation and decision-support. The whole system covers a decision-support server written in C++ with underlying data dictionary and knowledge base, a documentation module written in Java and a CORBA middleware that establishes a connection via Internet.
In this study different substitution methods for the replacement of missing data values were inspected for the use of these cases in a neural network based decision support system for acute appendicitis. The leucocyte count had the greatest number of missing values and was used in the analyses. Four different methods were compared: substituting means, random values, nearest neighbour and a neural network. There were great differences in the substituted leucocyte count values between different methods and only nearest neighbour and neural network agreed about most of the cases. The importance of the substitution method for the final diagnostic classification of the patients by the neural network based decision support system was found to be small.
It is difficult to assess hypothetical models in poorly measured domains such as neuroendocrinology. Without a large library of observations to constrain inference, the execution of such incomplete models implies making assumptions. Mutually exclusive assumptions must be kept in separate worlds. We define a general abductive multiple-worlds engine that assesses such models by (i) generating the worlds and (ii) testing whether these worlds contain known behaviour. World generation is constrained via the use of relevant envisionment. We describe QCM, a modeling language for compartmental models that can be processed by this inference engine. This tool has been used to find faults in theories published in international refereed journals; i.e. QCM can detect faults that are invisible to other methods. The generality and computational limits of this approach are discussed. In short, this approach is applicable to any representation that can be compiled into an and-or graph, provided the graphs are not too big or too intricate (fanout < 7).
Dysphagia assessment involves diagnosis of individual swallows in terms of the depth of airway invasion and degree of bolus clearance. The videofluoroscopic swallowing study is the current gold standard for dysphagia assessment but is time-consuming and costly. An ideal alternative would be an automated abnormal swallow detection methodology based on non-invasive signals.
Building upon promising results from single-axis cervical accelerometry, the objective of this study was to investigate the combination of dual-axis accelerometry and nasal airflow for classification of healthy and abnormal swallows in a patient population with dysphagia.
Signals were acquired from 24 adult patients with dysphagia (17.8±8.8 swallows per patient). The abnormality of each swallow was quantified using 4-point videofluoroscopic rating scales for its depth of airway invasion, bolus clearance from the valleculae, and bolus clearance from the pyriform sinuses. For each scale, we endeavored to automatically discriminate between the 2 extreme ratings, yielding 3 separate binary classification problems. Various time, frequency, and time-frequency domain features were extracted. A genetic algorithm was deployed for feature selection. Smoothed bootstrapping was utilized to balance the two classes and provide sufficient training data for a multidimensional feature space.
A Euclidean linear discriminant classifier resulted in a mean adjusted accuracy of 74.7% for the depth of airway invasion rating, whereas Mahalanobis linear discriminant classifiers yielded mean adjusted accuracies of 83.7% and 84.2% for bolus clearance from the valleculae and pyriform sinuses, respectively. The bolus clearance from the valleculae problem required the lowest feature space dimensionality. Wavelet features were found to be most discriminatory.
This exploratory study confirms that dual-axis accelerometry and nasal airflow signals can be used to discriminate healthy and abnormal swallows from patients with dysphagia. The fact that features from all signal channels contributed discriminatory information suggests that multi-sensor fusion is promising in abnormal swallow detection.
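The genetic-algorithm feature selection deployed in this study can be sketched as a search over binary feature masks. This skeleton is an illustration of the technique only: the study's fitness would be classifier accuracy on the swallow data, and the population size, crossover and mutation rules here are assumptions:

```python
import random

def ga_feature_select(fitness, n_features, pop=20, gens=30, seed=0):
    """Tiny elitist genetic algorithm over binary feature masks.
    fitness: callable mask -> score to maximize."""
    rng = random.Random(seed)
    P = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop)]
    for _ in range(gens):
        P.sort(key=fitness, reverse=True)
        elite = P[: pop // 2]               # keep the best half
        children = []
        while len(elite) + len(children) < pop:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]       # one-point crossover
            i = rng.randrange(n_features)
            child[i] ^= 1                   # single-bit mutation
            children.append(child)
        P = elite + children
    return max(P, key=fitness)
```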
The main objective of this paper is to present a novel learning algorithm for the classification of mass abnormalities in digitized mammograms.
The proposed approach consists of a new network architecture and a new learning algorithm. The original idea is the introduction of an additional neuron in the hidden layer for each output class. The additional neurons for the benign and malignant classes help improve memorization ability without destroying the generalization ability of the network. Training is conducted by combining minimal-distance-based similarity/random weights with direct calculation of output weights.
The proposed approach can memorize training patterns with 100% retrieval accuracy as well as achieve high generalization accuracy for patterns it has never seen before. Grey-level and Breast Imaging Reporting and Data System (BI-RADS)-based features extracted from digitized mammograms are used to train the network with the proposed architecture and learning algorithm. The best results achieved with the proposed approach are 100% on the training set and 94% on the test set.
The proposed approach produced very promising results. It has outperformed existing classification approaches in terms of classification accuracy, generalization and memorization abilities, number of iterations, and guaranteed training on a benchmark database.
This paper proposes a novel approach to cardiac arrhythmia recognition from electrocardiograms (ECGs). ECGs record the electrical activity of the heart and are used to diagnose many heart disorders. The numerical ECG is first temporally abstracted into a series of time-stamped events. Temporal abstraction makes use of artificial neural networks to extract interesting waves and their features from the input signals. A temporal reasoner called a chronicle recogniser processes such series in order to discover temporal patterns, called chronicles, which can be related to cardiac arrhythmias. Since it is generally difficult to elicit an accurate set of chronicles from a doctor, we propose to learn automatically, from symbolic ECG examples, the chronicles that discriminate the arrhythmias belonging to some specific subset. Since temporal relationships are of major importance, inductive logic programming (ILP) is the tool of choice, as it enables first-order relational learning. The approach has been evaluated on real ECGs taken from the MIT-BIH database. The performance of the different modules as well as the efficiency of the whole system is presented. The results are promising and demonstrate that integrating numerical techniques for low-level perception with symbolic techniques for high-level classification is very valuable.
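A chronicle can be viewed as a set of event types plus bounds on the time gaps between them; recognition then searches the event series for an assignment of events that satisfies every bound. A toy sketch of this idea (the P/QRS events and timing bounds are illustrative, not drawn from the paper, and a real recogniser would be incremental and forbid reusing one event for two slots):

```python
from itertools import product

def matches_chronicle(events, chronicle):
    """events: list of (label, time). chronicle: (labels, constraints), where
    constraints maps an index pair (i, j) to (lo, hi) bounds on
    time[j] - time[i]. Returns True if some assignment of events to the
    chronicle's slots satisfies every bound."""
    labels, constraints = chronicle
    candidates = [[t for lab, t in events if lab == want] for want in labels]
    for times in product(*candidates):
        if all(lo <= times[j] - times[i] <= hi
               for (i, j), (lo, hi) in constraints.items()):
            return True
    return False

# Illustrative pattern: a QRS complex must follow a P wave by 120-200 ms.
ecg = [("P", 0), ("QRS", 160), ("P", 800), ("QRS", 950)]
normal_pr = (["P", "QRS"], {(0, 1): (120, 200)})
```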
In this work, we deal with temporal abstraction of clinical data. Abstractions are, for example, blood pressure state (e.g. normal, high, low) and trend (e.g. increasing, decreasing and stationary) over time intervals. The goal of our work is to provide clinicians with automatic tools to extract high-level, concise, important features from available collections of time-stamped clinical data. This capability is especially important when the available collections constantly increase in size, as in long-term clinical follow-up, leading to information overload. The approach we propose exploits the integration of the deductive and object-oriented approaches in clinical databases. The main result of this work is an object-oriented data model based on the event calculus to support temporal abstraction. The proposed approach has been validated by building the CARDIOTABS system for the abstraction of clinical data collected during echocardiographic tests.
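State abstraction of this kind maps each time-stamped value to a qualitative label and merges consecutive samples sharing a label into intervals. A minimal sketch, where the blood-pressure thresholds are illustrative assumptions rather than clinical definitions:

```python
def state_abstraction(samples, classify):
    """Merge consecutive (time, value) samples that share a qualitative
    state into (state, start, end) intervals."""
    intervals = []
    for t, v in samples:
        s = classify(v)
        if intervals and intervals[-1][0] == s:
            intervals[-1] = (s, intervals[-1][1], t)   # extend current interval
        else:
            intervals.append((s, t, t))                # open a new interval
    return intervals

def bp_state(systolic):
    # Illustrative thresholds, not taken from the paper.
    if systolic < 90:
        return "low"
    if systolic > 140:
        return "high"
    return "normal"

samples = [(0, 120), (10, 125), (20, 150), (30, 155), (40, 130)]
abstracted = state_abstraction(samples, bp_state)
```

Trend abstraction works the same way, with `classify` applied to differences between successive values rather than to the values themselves.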
Intelligent clinical data analysis systems require precise qualitative descriptions of data to enable effective and context sensitive interpretation to take place. Temporal abstraction (TA) provides the means to achieve such descriptions, which can then be used as input to a reasoning engine where they are evaluated against a knowledge base to arrive at possible clinical hypotheses. This paper surveys previous research into the development of intelligent clinical data analysis systems that incorporate TA mechanisms and presents research synergies and trends across the research reviewed, especially those associated with the multi-dimensional nature of real-time patient data streams. The motivation for this survey is case study based research into the development of an intelligent real-time, high-frequency patient monitoring system to provide detection of temporal patterns within multiple patient data streams.
The survey was based on factors that are important for broadening research into temporal abstraction and on characteristics we believe will assume an increasing level of importance for future clinical IDA systems. These factors were: aspects of the data being abstracted, such as source domain and sample frequency; the complexity available within abstracted patterns; the dimensionality of the TA and data environment; and the knowledge and reasoning underpinning TA processes.
It is evident from the review that for intelligent clinical data analysis systems to progress into a future where clinical environments are becoming increasingly data-intensive, the ability to manage multi-dimensional aspects of data at high observation and sample frequencies must be provided. Also, the detection of complex patterns within patient data requires higher levels of TA than are presently available. The conflicting demands of computational tractability and temporal reasoning within a real-time environment present a non-trivial problem for investigation. Finally, to fully exploit the value of new knowledge learned from stored clinical data through data mining, and to enable its application to data abstraction, the fusion of data mining and TA processes becomes a necessity.
I describe the Temporal Control System (TCS), a programming system designed for building intelligent temporal monitoring programs. The ICU data set provided as part of the 1994 AAAI Spring Symposium challenge is used to conduct several experiments. Empirical results from the ICU data set validate the scalable design of the TCS. The remaining experiments examine the computational problem of generating interval values from sample points through persistence assumptions. Using abstractions in combination with persistence assumptions makes the design of higher-level clinical reasoning programs simpler. Abstraction can be used to suppress clinically unimportant details, allowing an expert system to focus on the key information provided by clinical monitors. The TCS provides the framework for the implementation as well as a method of calculating the 'cost' of different approaches. To prevent the use of outdated information, it is often useful to limit the time span of a persistent interval. I show that such limitations can be very costly computationally and then show how the application of symbolic abstraction can help. Further performance improvements from switching from continuous to discrete step persistence are shown. These performance-enhancing techniques have general applicability.
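A persistence assumption extends each point sample forward in time, and limiting the span of the resulting interval prevents the use of outdated values. A minimal sketch of bounded persistence (the temperature samples and the 15-minute span are illustrative, and real systems must do this incrementally rather than in batch):

```python
def persist(samples, max_span):
    """Turn (time, value) point samples into (value, start, end) intervals:
    each value persists until the next sample arrives or until max_span
    elapses, whichever comes first."""
    intervals = []
    for i, (t, v) in enumerate(samples):
        next_t = samples[i + 1][0] if i + 1 < len(samples) else t + max_span
        intervals.append((v, t, min(next_t, t + max_span)))
    return intervals

# Illustrative temperature samples with a 15-minute persistence limit:
# the 60-minute gap after the second sample leaves a hole in coverage.
samples = [(0, 98.6), (5, 99.1), (60, 98.7)]
intervals = persist(samples, max_span=15)
```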
To compare two temporal abstraction procedures for the extraction of meta features from monitoring data. Feature extraction prior to predictive modeling is a common strategy in prediction from temporal data. A fundamental dilemma in this strategy, however, is to what extent the extraction should be guided by domain knowledge, and to what extent it should be guided by the available data. The two temporal abstraction procedures compared in this case study differ in this respect.
The first temporal abstraction procedure derives symbolic descriptions from the data that are predefined using existing concepts from the medical language. In the second procedure, a large space of numerical meta features is searched to discover relevant features in the data. These procedures were applied to a prediction problem using intensive care monitoring data. The predictive value of the resulting meta features was compared, and based on each type of feature, a class probability tree model was developed.
The numerical meta features extracted by the second procedure were found to be more informative than the symbolic meta features of the first procedure in the case study, and a superior predictive performance was observed for the associated tree model.
The findings indicate that for prediction from monitoring data, induction of numerical meta features from data is preferable to extraction of symbolic meta features using existing clinical concepts.
We have defined a knowledge-based framework for the creation of abstract, interval-based concepts from time-stamped clinical data: the knowledge-based temporal-abstraction (KBTA) method. The KBTA method decomposes its task into five subtasks; for each subtask we propose a formal solving mechanism. Our framework emphasizes explicit representation of the knowledge required for abstraction of time-oriented clinical data, and facilitates its acquisition, maintenance, reuse and sharing. The RESUME system implements the KBTA method. We tested RESUME in several clinical-monitoring domains, including the monitoring of patients who have insulin-dependent diabetes. We acquired diabetes-therapy temporal-abstraction knowledge from a diabetes-therapy expert. Two diabetes-therapy experts (including the first one) created temporal abstractions from about 800 points of diabetic-patients' data. RESUME generated about 80% of the abstractions agreed upon by both experts; about 97% of the generated abstractions were valid. We discuss the advantages and limitations of the current architecture.
Medical diagnosis and therapy planning at modern intensive care units (ICUs) have been refined by the technical improvement of their equipment. However, the bulk of continuous data arising from complex monitoring systems in combination with discontinuously assessed numerical and qualitative data creates a rising information management problem at neonatal ICUs (NICUs). We developed methods for data validation and therapy planning which incorporate knowledge about point and interval data, as well as expected qualitative trend descriptions to arrive at unified qualitative descriptions of parameters (temporal data abstraction). Our methods are based on schemata for data-point transformation and curve fitting which express the dynamics of and the reactions to different degrees of parameters' abnormalities as well as on smoothing and adjustment mechanisms to keep the qualitative descriptions stable. We show their applicability in detecting anomalous system behavior early, in recommending therapeutic actions, and in assessing the effectiveness of these actions within a certain period. We implemented our methods in VIE-VENT, an open-loop knowledge-based monitoring and therapy planning system for artificially ventilated newborn infants. The applicability and usefulness of our approach are illustrated by examples of VIE-VENT. Finally, we present our first experiences with using VIE-VENT in a real clinical setting.
The specification and creation of a distributed system that integrates medical knowledge bases with time-oriented clinical databases; the goal is to answer complex temporal queries regarding both raw data and its abstractions, such as are often required in medical applications.
(1) Specification, design, and implementation of a generalized access method to a set of heterogeneous clinical data sources, by using a virtual medical-record interface and by mapping the local terms to a set of standardized medical vocabularies; (2) specification of a generalized interface to a set of knowledge sources; (3) specification and implementation of a service, called ALMA, that computes complex time-oriented medical queries that include both raw data and abstractions derivable from it; (4) design and implementation of a mediator, called IDAN, that answers raw-data and abstract queries by integrating the appropriate clinical data with the relevant medical knowledge and uses the computation service to answer the queries; (5) an expressive language that enables definition of time-dependent medical queries, which are referred to the mediator; (6) evaluation of the effect of the system, when combined with a new visual interface, called KNAVE-II, on the speed and accuracy of answering a set of complex queries in an oncology subdomain, by a group of clinicians, compared to answering these queries using paper or an electronic spreadsheet.
We have implemented the full IDAN architecture. The IDAN/KNAVE-II combination significantly increased the accuracy and speed of answering complex queries about both the data and their abstractions, compared to the standard tools.
The implemented architecture proves the feasibility of the distributed integration of medical knowledge sources with clinical data of heterogeneous sources. The results suggest that the proposed IDAN modular architecture has potential significance for supporting the automation of clinical tasks such as diagnosis, monitoring, therapy, and quality assessment.
We present KNAVE-II, an intelligent interface to a distributed architecture specific to the tasks of query, knowledge-based interpretation, summarization, visualization, and interactive exploration of large amounts of distributed time-oriented clinical data, and dynamic sensitivity analysis of these data. KNAVE-II's main contributions to the fields of temporal reasoning and intelligent user interfaces are: (1) the capability for interactive computation and visualization of domain-specific temporal abstractions, supported by ALMA, a computational engine that applies the domain knowledge base to the clinical time-oriented database; (2) semantic (ontology-based) navigation and exploration of the data, knowledge, and temporal abstractions, supported by the IDAN mediator, a distributed architecture that enables runtime access to domain-specific knowledge bases maintained by expert physicians.
KNAVE-II was designed according to 12 requirements that were defined through iterative cycles of design and user-centered evaluation. The complete architecture has been implemented and evaluated in a cross-over study design that compared the KNAVE-II module versus two existing methods: paper charts and an Excel electronic spreadsheet. A small group of clinicians answered the same queries, using the domain of oncology and a set of 1000 patients followed after bone-marrow transplantation.
The results show that users are able to perform queries of medium to hard difficulty faster and more accurately using KNAVE-II than with paper charts and Excel. Moreover, KNAVE-II was ranked first in preference by all users, along all usability dimensions.
Initial evaluation of KNAVE-II and its supporting knowledge based temporal-mediation architecture, by applying it to a large data base of patients monitored several years after bone marrow transplantation (BMT), has produced highly encouraging results.
The main goal of this work is to propose a framework for the visual specification and query of consistent multi-granular clinical temporal abstractions. We focus on the issue of querying patient clinical information by visually defining and composing temporal abstractions, i.e., high level patterns derived from several time-stamped raw data. In particular, we focus on the visual specification of consistent temporal abstractions with different granularities and on the visual composition of different temporal abstractions for querying clinical databases.
Temporal abstractions on clinical data provide a concise, high-level description of temporal raw data, and a suitable way to support decision making. Granularities define partitions on the time line and allow one to represent time, and thus temporal clinical information, at different levels of detail, according to the requirements of the represented clinical domain. The visual representation of temporal information has been considered for several years in clinical domains. Proposed visualization techniques must be easy and quick to understand, and can benefit from visual metaphors that do not lead to ambiguous interpretations. Recently, physical metaphors such as strips, springs, weights, and wires have been proposed and evaluated on clinical users for the specification of temporal clinical abstractions. Visual approaches to boolean queries have been considered in recent years, and this work has confirmed that visual support for the specification of complex boolean queries is both an important and a difficult research topic.
We propose and describe a visual language for the definition of temporal abstractions based on a set of intuitive metaphors (striped wall, plastered wall, brick wall), allowing the clinician to use different granularities. A new algorithm, underlying the visual language, allows the physician to specify only consistent abstractions, i.e., abstractions not containing contradictory conditions on the component abstractions. Moreover, we propose a visual query language where different temporal abstractions can be composed to build complex queries: temporal abstractions are visually connected through the usual logical connectives AND, OR, and NOT.
The proposed visual language allows one to simply define temporal abstractions using intuitive metaphors, and to specify temporal intervals related to abstractions using different temporal granularities. The physician can interact with the designed and implemented tool through point-and-click selections, and can visually compose queries involving several temporal abstractions. The evaluation of the proposed granularity-related metaphors consisted of two parts: (i) solving 30 interpretation exercises by choosing the correct interpretation of a given screenshot representing a possible scenario, and (ii) solving a complex exercise by visually specifying, through the interface, a scenario described only in natural language. The exercises were done by 13 subjects. The percentages of correct answers to the interpretation exercises differed slightly across the considered metaphors (54.4--striped wall, 73.3--plastered wall, 61--brick wall, and 61--no wall), but post hoc statistical analysis on the means confirmed that the differences were not statistically significant. The user-satisfaction questionnaire related to the evaluation of the proposed granularity-related metaphors confirmed that there is no preference for any one of them. The evaluation of the proposed logical notation consisted of two parts: (i) solving five interpretation exercises, each consisting of a screenshot representing a possible scenario and three different possible interpretations, of which only one was correct, and (ii) solving five exercises by visually defining, through the interface, a scenario described only in natural language. The exercises had increasing difficulty. The evaluation involved a total of 31 subjects.
Results from this evaluation phase confirmed the soundness of the proposed solution, even in comparison with a well-known proposal based on a tabular query form (the only significant difference being that our proposal requires more time for the training phase: 21 min versus 14 min).
Discussion and conclusions:
In this work we have considered the issue of visually composing and querying temporal clinical patient data. In this context we have proposed a visual framework for the specification of consistent temporal abstractions with different granularities and for the visual composition of different temporal abstractions to build (possibly) complex queries on clinical databases. A new algorithm has been proposed to check the consistency of the specified granular abstractions. The evaluation of the proposed metaphors and interfaces, and the comparison of the visual query language with a well-known visual method for boolean queries, confirmed the soundness of the overall system; moreover, pros and cons and possible improvements emerged from the comparison of the different visual metaphors and solutions.
To determine whether the automatic classification of documents can be useful in systematic reviews on medical topics, and specifically if the performance of the automatic classification can be enhanced by using the particular protocol of questions employed by the human reviewers to create multiple classifiers.
The test collection is the data used in a large-scale systematic review on the topic of the dissemination strategy of health care services for elderly people. From a group of 47,274 abstracts marked by human reviewers to be included in or excluded from further screening, we randomly selected 20,000 as a training set, with the remaining 27,274 becoming a separate test set. As the machine learning algorithm we used complement naïve Bayes. We tested both a global classification method, where a single classifier is trained on instances of abstracts and their classification (i.e., included or excluded), and a novel per-question classification method that trains multiple classifiers for each abstract, exploiting the specific protocol (questions) of the systematic review. For the per-question method we tested four ways of combining the results of the classifiers trained for the individual questions. As evaluation measures, we calculated precision and recall for several settings of the two methods. It is most important not to exclude any relevant documents (i.e., to attain high recall on the class of interest), but it is also desirable to exclude most of the non-relevant documents (i.e., to attain high precision on the class of interest) in order to reduce human workload.
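One recall-oriented way to combine per-question classifiers, in the spirit of the per-question method, is inclusive (OR) voting: an abstract is retained if any question-specific classifier flags it. A toy sketch (the keyword stubs are purely illustrative stand-ins for trained complement naïve Bayes classifiers):

```python
def or_combine(classifiers, abstract):
    """Include the abstract if ANY per-question classifier votes 'include'.
    A single positive vote suffices, which favours recall over precision."""
    return any(clf(abstract) for clf in classifiers)

# Hypothetical question-specific classifiers; in the real pipeline each
# would be a classifier trained on one protocol question.
q_population = lambda text: "elderly" in text.lower()
q_intervention = lambda text: "dissemination" in text.lower()

abstracts = [
    "A dissemination strategy trial in adult outpatients.",
    "Cohort study of nutrition in young athletes.",
]
included = [a for a in abstracts
            if or_combine([q_population, q_intervention], a)]
```

The trade-off is visible even in this sketch: OR voting rarely misses a relevant abstract, at the cost of admitting more false positives.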
For the global method, the highest recall was 67.8% and the highest precision was 37.9%. For the per-question method, the highest recall was 99.2%, and the highest precision was 63%. The human-machine workflow proposed in this paper achieved a recall value of 99.6%, and a precision value of 17.8%.
The per-question method that combines classifiers following the specific protocol of the review leads to better results than the global method in terms of recall. Because neither method is efficient enough to classify abstracts reliably by itself, the technology should be applied in a semi-automatic way, with a human expert still involved. When the workflow includes one human expert and the trained automatic classifier, recall improves to an acceptable level, showing that automatic classification techniques can reduce the human workload in the process of building a systematic review.
In this study, a methodology is presented for an automated levodopa-induced dyskinesia (LID) assessment in patients suffering from Parkinson's disease (PD) under real-life conditions.
The methodology is based on the analysis of signals recorded from several accelerometers and gyroscopes, which were placed on the subjects' bodies while they performed a series of standardised motor tasks as well as voluntary movements. Sixteen subjects were enrolled in the study. The recordings were analysed in order to extract several features and, based on these features, a classification technique was used for LID assessment, i.e. detection of LID symptoms and classification of their severity.
The results were compared with the clinical annotation of the signals, provided by two expert neurologists. The analysis examined the number and topology of the sensors used; several different experimental settings were evaluated, with a 10-fold stratified cross-validation technique employed in all cases. Moreover, several different classification techniques were examined. The ability of the methodology to generalise was also evaluated using leave-one-patient-out cross-validation. The sensitivity and positive predictive values (averaged over all LID severities) were 80.35% and 76.84%, respectively.
The proposed methodology can be applied in real-life conditions since it can perform LID assessment in recordings which include various PD symptoms (such as tremor, dyskinesia and freezing of gait) of several motor tasks and random voluntary movements.
While clinical trials offer cancer patients the optimum treatment, historical accrual of such patients has not been very successful. OncoDoc is a decision support system designed to provide the best therapeutic recommendations for breast cancer patients. Developed as a browsing tool over a knowledge base structured as a decision tree, OncoDoc allows physicians to control the contextual instantiation of patient characteristics to build the best formal equivalent of an actual patient. Used as a computer-based eligibility screening system, depending on whether instantiated patient parameters are matched against guideline knowledge or available clinical trial protocols, it provides either evidence-based therapeutic options or relevant patient-specific clinical trials. Implemented at the Gustave Roussy Institute and routinely used at the point of care during a 4-month period, it significantly improved physician compliance with guideline recommendations and enhanced physician awareness of open trials while increasing patient enrollment in clinical trials by 50%. However, analysis of the reasons for non-accrual of potentially eligible patients suggested that physicians' psychological reluctance to refer patients to clinical trials, measured at 25% during the experiment, may not be resolved by simply disseminating clinical trial information at the point of care.
This paper presents the results obtained with the innovative use of special types of artificial neural networks (ANNs) assembled in a novel methodology named IFAST (implicit function as squashing time), capable of compressing the temporal sequence of electroencephalographic (EEG) data into spatial invariants. The aim of this study is to assess the potential of this parallel and nonlinear EEG analysis technique in distinguishing between subjects with mild cognitive impairment (MCI) and Alzheimer's disease (AD) patients with a high degree of accuracy, in comparison with standard and advanced nonlinear techniques. In particular, the study tests the hypothesis that automatic classification of MCI and AD subjects can be reasonably accurate when the spatial content of the EEG voltage is properly extracted by ANNs.
Methods and material:
Resting eyes-closed EEG data were recorded in 180 AD patients and in 115 MCI subjects. The spatial content of the EEG voltage was extracted by the IFAST step-wise procedure using ANNs. The input for the ANN-based classification was not the EEG data themselves, but the connection weights of a nonlinear auto-associative ANN trained to reproduce the recorded EEG tracks. These weights represented a good model of the peculiar spatial features of the EEG patterns at the scalp surface. The classification based on these parameters was binary (MCI versus AD) and was performed by a supervised ANN. Half of the EEG database was used for ANN training and the remaining half was used for the automatic classification phase (testing).
The best accuracy in distinguishing between AD and MCI reached 92.33%. The comparative result obtained with the best method so far described in the literature, based on blind source separation and wavelet pre-processing, was 80.43% (p<0.001).
The results confirmed the working hypothesis that a correct automatic classification of MCI and AD subjects can be obtained extracting spatial information content of the resting EEG voltage by ANNs and represent the basis for research aimed at integrating spatial and temporal information content of the EEG.
One of the hardest technical tasks in employing Bayesian network models in practice is obtaining their numerical parameters. In the light of this difficulty, a pressing question, one that has immediate implications on the knowledge engineering effort, is whether precision of these parameters is important. In this paper, we address experimentally the question whether medical diagnostic systems based on Bayesian networks are sensitive to precision of their parameters.
Methods and materials:
The test networks include Hepar II, a sizeable Bayesian network model for diagnosis of liver disorders, and six other medical diagnostic networks constructed from medical data sets available through the Irvine Machine Learning Repository. Assuming that the original model parameters are perfectly accurate, we systematically lower their precision by rounding them to progressively coarser scales and check the impact of this rounding on the models' accuracy.
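Rounding a conditional probability table to a coarser scale can produce exact zeros, which (unlike mere imprecision) do change model behaviour, so they must be replaced before renormalizing. A minimal sketch of this rounding step, where the `eps` floor is an illustrative choice rather than the paper's procedure:

```python
def round_cpt(probs, decimals, eps=1e-3):
    """Round a probability vector to a coarser scale, replace exact zeros
    with a small floor (zeros, unlike imprecision, change the model's
    qualitative behaviour), and renormalize to sum to 1."""
    rounded = [round(p, decimals) for p in probs]
    rounded = [p if p > 0 else eps for p in rounded]
    total = sum(rounded)
    return [p / total for p in rounded]

# 0.004 rounds to an exact zero at one decimal place and gets floored.
coarse = round_cpt([0.004, 0.496, 0.5], decimals=1)
```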
Our main result, consistent across all tested networks, is that imprecision in numerical parameters has minimal impact on the diagnostic accuracy of models, as long as we avoid zeroes among parameters.
The experiments' results provide evidence that as long as we avoid zeroes among model parameters, diagnostic accuracy of Bayesian network models does not suffer from decreased precision of their parameters.
Psychometric questionnaires such as the Barratt Impulsiveness Scale version 11 (BIS-11) have been used in the assessment of suicidal behavior. Traditionally, BIS-11 items have been considered equally valuable, but this might not be true. The main objective of this article is to test the discriminative ability of the BIS-11 and the International Personality Disorder Examination screening questionnaire (IPDE-SQ) to predict suicide attempter (SA) status using different classification techniques. In addition, we examine the discriminative capacity of individual items from both scales.
Two experiments were conducted to evaluate the accuracy of different classification techniques. The answers of 879 individuals (345 SA, 384 healthy blood donors, and 150 psychiatric inpatients) to the BIS-11 and IPDE-SQ were used to compare the classification performance of two techniques that have been successfully applied to pattern recognition problems, boosting and support vector machines (SVM), against linear discriminant analysis, Fisher linear discriminant analysis, and the traditional psychometric approach.
The most discriminative BIS-11 and IPDE-SQ items are "I am self controlled" (item 6) and "I often feel empty inside" (item 40), respectively. The SVM classification accuracy was 76.71% for the BIS-11 and 80.26% for the IPDE-SQ.
The IPDE-SQ items have better discriminative abilities than the BIS-11 items for classifying SA. Moreover, IPDE-SQ is able to obtain better SA and non-SA classification results than the BIS-11. In addition, SVM outperformed the other classification techniques in both questionnaires.
One of the main problems in cancer diagnosis by using DNA microarray data is selecting genes relevant for the pathology by analyzing their expression profiles in tissues in two different phenotypical conditions. The question we pose is the following: how do we measure the relevance of a single gene in a given pathology?
A gene is relevant for a particular disease if we are able to correctly predict the occurrence of the pathology in new patients on the basis of its expression level alone. In other words, a gene is informative for the disease if its expression levels are useful for training a classifier able to generalize, that is, able to correctly predict the status of new patients. In this paper we present a selection-bias-free, statistically well-founded method for finding relevant genes on the basis of their classification ability.
We applied the method to a colon cancer data set and produced a list of relevant genes, ranked on the basis of their prediction accuracy. Out of more than 6500 available genes, we found 54 overexpressed in normal tissues and 77 overexpressed in tumor tissues having prediction accuracy greater than 70% with p-value ≤ 0.05.
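Ranking genes by single-feature prediction accuracy can be sketched as a threshold rule per gene plus a binomial p-value against chance. The toy version below evaluates on the same data it fits and therefore ignores the selection-bias correction that is central to the actual method; the expression values are illustrative:

```python
from math import comb

def binom_pvalue(correct, n, p=0.5):
    """P(X >= correct) for X ~ Binomial(n, p): the chance of doing at
    least this well by guessing."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(correct, n + 1))

def gene_accuracy(expr, labels):
    """Accuracy of the best single-threshold rule (either orientation)
    on one gene's expression levels."""
    n = len(labels)
    best = 0
    for thr in set(expr):
        hits = sum((e > thr) == bool(y) for e, y in zip(expr, labels))
        best = max(best, hits, n - hits)   # n - hits: flipped orientation
    return best / n

expr = [1.0, 1.2, 0.9, 3.1, 2.8, 3.4]   # one gene across six tissues
labels = [0, 0, 0, 1, 1, 1]             # 0 = normal, 1 = tumor
acc = gene_accuracy(expr, labels)
pval = binom_pvalue(round(acc * len(labels)), len(labels))
```

In a faithful pipeline, the threshold would be chosen on training folds and the accuracy estimated on held-out samples to avoid the optimistic bias this sketch exhibits.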
The relevance of the selected genes was assessed (a) statistically, by evaluating the p-value of the estimated prediction accuracy of each gene; (b) biologically, by confirming the involvement of many genes in generic carcinogenic processes and in particular in those of the colon; (c) comparatively, by verifying the presence of these genes in other studies on the same data set.
Classification algorithms can be used to predict risks and responses of patients based on genomic and other high-dimensional data. While there is optimism for using these algorithms to improve the treatment of diseases, they have yet to demonstrate sufficient predictive ability for routine clinical practice. They generally classify all patients according to the same criteria, under an implicit assumption of population homogeneity. The objective here is to allow for population heterogeneity, possibly unrecognized, in order to increase classification accuracy and further the goal of tailoring therapies on an individualized basis.
A new selective-voting algorithm is developed in the context of a classifier ensemble of two-dimensional convex hulls of positive and negative training samples. Individual classifiers in the ensemble are allowed to vote on test samples only if those samples are located within or behind pruned convex hulls of training samples that define the classifiers.
Validation of the new algorithm's increased accuracy was carried out using two publicly available data sets having cancer as the outcome variable and the expression levels of thousands of genes as predictors. Selective voting led to statistically significant increases in accuracy, from 86.0% to 89.8% (p<0.001) and from 63.2% to 67.8% (p<0.003), compared to the original algorithm.
Selective voting by members of convex-hull classifier ensembles significantly increases classification accuracy compared to one-size-fits-all approaches.
One of the interesting computational topics in bioinformatics is the prediction of protein secondary structure. Over 30 years of research has been devoted to the topic, but we are still far from reliable prediction methods. A critical piece of information for accurate prediction of secondary structure is the helix and strand content of a given protein sequence. The ability to accurately predict the content of these two secondary structures has good potential to improve the accuracy of secondary structure prediction. Most existing methods use the composition vector to predict the content. Their underlying assumption is that the vector provides a functional mapping between the primary sequence and the helix/strand content. While this holds for small sets of proteins, we show that for larger protein sets such mappings are inconsistent, i.e., the same composition vector corresponds to different contents. To this end, we propose a method for predicting helix/strand content from primary protein sequences that is fundamentally different from currently available methods.
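The composition vector underlying the existing methods, and the inconsistency argument against it, can both be made concrete in a few lines. The sequences below are arbitrary examples chosen for illustration: any two sequences that are permutations of each other necessarily share the same composition vector, so a composition-based predictor must assign them identical helix/strand content even if their true structures differ.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition_vector(seq):
    """Fraction of each of the 20 amino acids in the sequence --
    the feature most content-prediction methods are built on."""
    counts = Counter(seq)
    n = len(seq)
    return tuple(counts.get(aa, 0) / n for aa in AMINO_ACIDS)

# Two different primary sequences (one a permutation of the other)
# are indistinguishable to any composition-based method.
s1 = "AELKAAVGKV"
s2 = "VKGVAAKLEA"
assert composition_vector(s1) == composition_vector(s2)
```

Since the mapping from sequence to composition discards all ordering information, growing the protein set makes such collisions increasingly likely, which is exactly the inconsistency the abstract points to.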
In previous work, we presented an algorithm that extracts classification rules from trained neural networks and discussed its application to breast cancer diagnosis. In this paper, we describe how the accuracy of the networks, and of the rules extracted from them, can be improved by simple pre-processing of the data. Pre-processing involves selecting the relevant input attributes and removing samples with missing attribute values. The rules generated by our neural network rule extraction algorithm are more concise and accurate than those generated by other rule-generating methods reported in the literature.
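The two pre-processing steps can be sketched as a single helper. The attribute-selection criterion itself is not reproduced here; the column indices in `selected_attrs` are assumed to be given by whatever relevance measure is used, and missing values are assumed to be encoded as NaN.

```python
import numpy as np

def preprocess(X, selected_attrs):
    """Sketch of the pre-processing described above:
    (1) keep only the selected input attributes, then
    (2) drop every sample that still contains a missing (NaN) value."""
    X = np.asarray(X, dtype=float)[:, selected_attrs]
    return X[~np.isnan(X).any(axis=1)]
```

Note the order matters: projecting onto the selected attributes first means a sample is only discarded when one of the attributes the network will actually use is missing.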