Article

Feature Selection for Interpatient Supervised Heart Beat Classification.

... Moreover, it is important to employ the most suitable classifier for detection purposes. We propose techniques to accurately classify heartbeats by training a classifier on a small feature set (consisting of time-domain, frequency-domain, and ECG-morphology features) obtained by employing a feature selection method [5] [6]. The paper is organized as follows. ...
... Armed with the 31-dimensional feature vector, we apply a dimensionality reduction method to identify the subset of features most correlated with the class label. The feature selection method is known as the incremental wrapper approach [5], [6]. In statistical learning, wrapper methods treat feature selection as a search problem. ...
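The incremental wrapper approach mentioned above can be sketched as a greedy forward search: at each step, every unselected feature is tried in combination with the current subset, the classifier is re-evaluated, and the feature giving the largest accuracy gain is kept. The sketch below is an illustrative assumption, not the cited papers' code; the 1-nearest-neighbour classifier and the toy data are stand-ins for whatever classifier the wrapper actually wraps.

```python
# Hypothetical sketch of incremental (forward) wrapper feature selection.
# The wrapped "classifier" is a simple 1-nearest-neighbour rule; in the cited
# work any classifier could take its place.

def accuracy(X_train, y_train, X_test, y_test, feats):
    """Validation accuracy of a 1-NN classifier restricted to `feats`."""
    correct = 0
    for xt, yt in zip(X_test, y_test):
        # Nearest training sample, distance computed only over selected features.
        best = min(range(len(X_train)),
                   key=lambda i: sum((X_train[i][f] - xt[f]) ** 2 for f in feats))
        if y_train[best] == yt:
            correct += 1
    return correct / len(y_test)

def incremental_wrapper(X_train, y_train, X_val, y_val, n_features):
    """Greedily add the feature that most improves validation accuracy."""
    selected, remaining = [], list(range(n_features))
    best_acc = 0.0
    while remaining:
        scored = [(accuracy(X_train, y_train, X_val, y_val, selected + [f]), f)
                  for f in remaining]
        acc, f = max(scored)
        if acc <= best_acc:          # stop when no candidate improves accuracy
            break
        best_acc = acc
        selected.append(f)
        remaining.remove(f)
    return selected, best_acc
```

Because the classifier itself scores each candidate subset, the wrapper naturally stops once extra features stop paying for themselves, which is how such methods arrive at small, interpretable feature sets.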
Article
We present algorithms for the detection of a class of heart arrhythmias with the goal of eventual adoption by practicing cardiologists. In clinical practice, detection is based on a small number of meaningful features extracted from the heartbeat cycle. However, techniques proposed in the literature use high-dimensional vectors of morphological and time-based features for detection. Using electrocardiogram (ECG) signals, we found smaller subsets of features sufficient to detect arrhythmias with high accuracy. The features were found by an iterative step-wise feature selection method. We depart from the common literature in the following aspects: 1) as opposed to high-dimensional feature vectors, we use a small set of features with meaningful clinical interpretation; 2) we eliminate the need to append short-duration patient-specific ECG data to the global training data for classification; 3) we apply semi-parametric classification procedures (in an ensemble framework) for arrhythmia detection; and 4) our approach is based on a reduced sampling rate of ~115 Hz, as opposed to the 360 Hz standard in the literature.
Article
Full-text available
Decision support systems can seriously help medical doctors in the diagnosis of different diseases, especially in complicated cases. This article is devoted to recognizing and diagnosing heart disease based on automatic computer processing of patients' electrocardiograms (ECG). In the general case, the change of the ECG parameters can be presented as a random sequence of the signals under processing. Developing new computational methods for such signal processing is an important research problem in creating efficient medical decision support systems. The authors consider the possibility of increasing the diagnostic accuracy of cardiovascular diseases by implementing the newly proposed computational method of information processing. This method is based on the generalized nonlinear canonical decomposition of a random sequence of the change of cardiogram parameters. The use of a nonlinear canonical model makes it possible to significantly simplify the maximum likelihood criterion for classifying diseases. This simplification is provided by the transition from a multi-dimensional distribution density of cardiogram parameters to a product of one-dimensional distribution densities of independent random coefficients of a nonlinear canonical decomposition. The absence of any restrictions on the class of random sequences under study makes it possible to achieve maximum accuracy in diagnosing cardiovascular diseases. Functional diagrams for implementing the proposed method, reflecting the features of its application, are presented. The quantitative parameters of the core of the computational diagnostic procedure can be determined in advance from preliminary statistical ECG data for different heart diseases. The developed method is therefore quite simple in computational terms (computing complexity, accuracy, computing time, etc.) and can be implemented in medical computer decision systems for monitoring cardiovascular diseases and diagnosing them in real time. The results of the numerical experiment confirm the high accuracy of the developed method for classifying cardiovascular diseases.
Article
Full-text available
Accurate detection of cardiac pathological events is an important part of electrocardiogram (ECG) evaluation and subsequent correct treatment of the patient. The paper introduces the results of a complex study in which various aspects of automatic classification of heartbeat types were addressed. In particular, non-ischemic, ischemic (of two different grades) and subsequent ventricular premature beats were classified in this combination for the first time. ECGs recorded in rabbit isolated hearts under non-ischemic and ischemic conditions were used for analysis. Various morphological and spectral features (both commonly used and newly proposed) as well as classification models were tested on the same data set. It was found that: a) morphological features are generally more suitable than spectral ones; b) successful results (accuracy up to 98.3% and 96.2% for morphological and spectral features, respectively) can be achieved using features calculated without time-consuming delineation of the QRS-T segment; c) using a reduced number of features (3 to 14) for model training achieves similar or even better performance than the whole feature sets (10 to 29 features); d) k-nearest neighbours and support vector machine seem to be the most appropriate models (accuracy up to 98.6% and 93.5%, respectively).
Article
Full-text available
The paper presents an original filter approach for effective feature selection in microarray data characterized by a large number of input variables and few samples. The approach is based on a new information-theoretic selection criterion, the double input symmetrical relevance (DISR), which relies on a measure of variable complementarity. This measure evaluates the additional information that a set of variables provides about the output with respect to the sum of each single variable's contribution. We show that a variable selection approach based on DISR can be formulated as a quadratic optimization problem: the dispersion sum problem (DSP). To solve this problem, we use a strategy based on backward elimination and sequential replacement (BESR). The combination of BESR and the DISR criterion is compared in theoretical and experimental terms to recently proposed information-theoretic criteria. Experimental results on a synthetic dataset as well as on a set of eleven microarray classification tasks show that the proposed technique is competitive with existing filter selection methods.
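The complementarity idea underlying DISR can be made concrete with a small numeric illustration: the joint mutual information I(X1, X2; Y) of a variable pair can exceed the sum of the individual contributions I(X1; Y) + I(X2; Y), the classic case being Y = XOR(X1, X2), where each variable alone carries zero information about the output. The following is an assumed illustration of that effect, not the paper's implementation, and the function names are hypothetical.

```python
import math
from collections import Counter

def mi(xs, ys):
    """Plug-in estimate of I(X; Y) from co-occurrence counts, in nats."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    # I(X;Y) = sum over (x, y) of p(x,y) * log( p(x,y) / (p(x) p(y)) )
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def complementarity(x1, x2, y):
    """Extra information the pair carries beyond its parts:
    I(X1, X2; Y) - I(X1; Y) - I(X2; Y)."""
    joint = list(zip(x1, x2))   # treat the variable pair as one joint variable
    return mi(joint, y) - mi(x1, y) - mi(x2, y)
```

For the XOR pattern, `complementarity` returns log 2 nats while each single-variable term is zero: exactly the situation where a criterion that scores variables one at a time fails and a complementarity-aware criterion like DISR does not.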
Article
Data from spectrophotometers form vectors with a large number of exploitable variables. Building quantitative models from these variables most often requires using a smaller set than the initial one: too many input variables lead to too many model parameters, causing overfitting and poor generalization. In this paper, we suggest using the mutual information measure to select variables from the initial set. Mutual information measures the information content of input variables with respect to the model output without making any assumption about the model that will be used; it is thus suitable for nonlinear modelling. In addition, it selects variables from the initial set rather than linear or nonlinear combinations of them, and therefore allows greater interpretability of the results without decreasing model performance compared to other variable projection methods.
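The filter strategy described above reduces to computing I(X; Y) for each candidate variable and keeping the highest-scoring ones. A minimal sketch, assuming discrete (or pre-discretized) variables and using a plug-in estimate from counts; the names and toy usage are illustrative, not the paper's code:

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))          # joint counts
    px, py = Counter(xs), Counter(ys)   # marginal counts
    mi = 0.0
    for (x, y), c in pxy.items():
        # I(X;Y) = sum p(x,y) * log( p(x,y) / (p(x) p(y)) )
        mi += (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
    return mi

def rank_variables(columns, ys):
    """Sort variable indices by decreasing mutual information with the output."""
    return sorted(range(len(columns)),
                  key=lambda j: mutual_information(columns[j], ys),
                  reverse=True)
```

A variable identical to the output scores log 2 nats (for a balanced binary output), while a variable independent of it scores zero, so ranking by this estimate keeps the informative inputs; continuous spectrophotometric variables would first need discretization or a continuous MI estimator.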