Article

Bayesian Machine Learning for EEG/MEG Signal Processing


Abstract

EEG and MEG are the most common noninvasive brain imaging techniques for monitoring electrical brain activity and inferring brain function. The central goal of EEG/MEG analysis is to extract informative brain spatio-temporal-spectral patterns or to infer functional connectivity between different brain areas, which are directly useful for neuroscience or clinical investigations. Because of the potentially complex nature of these signals (nonstationarity, high dimensionality, subject variability, low signal-to-noise ratio), EEG/MEG signal processing poses great challenges for researchers. These challenges can be addressed in a principled manner via Bayesian machine learning (BML). BML is an emerging field that integrates Bayesian statistics, variational methods, and machine learning techniques to solve problems ranging from regression, prediction, and outlier detection to feature extraction and classification. BML has recently gained increasing attention and achieved widespread success in signal processing and big data analytics, such as in source reconstruction, compressed sensing, and information fusion. To review recent advances and to foster new research ideas, we provide a tutorial on several important emerging BML research topics in EEG/MEG signal processing and present representative examples in EEG/MEG applications.
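As a concrete illustration of the kind of model the tutorial surveys, the sketch below (not taken from the paper) shows sparse Bayesian learning with automatic relevance determination for a linear forward model y = Lx + noise, with the prior variances learned by evidence maximization via simple EM fixed-point updates; the lead field, noise level, and toy data are assumptions made purely for demonstration.

```python
# Illustrative sketch: sparse Bayesian learning (ARD) for y ~= L @ x + noise.
import numpy as np

def sbl_ard(L, y, noise_var=1e-2, n_iter=50):
    """Estimate sparse coefficients x from y using ARD-style EM updates."""
    n_sensors, n_sources = L.shape
    gamma = np.ones(n_sources)                       # prior variance of each source
    for _ in range(n_iter):
        Gamma = np.diag(gamma)
        Sigma_y = noise_var * np.eye(n_sensors) + L @ Gamma @ L.T
        Sigma_y_inv = np.linalg.inv(Sigma_y)
        mu = Gamma @ L.T @ Sigma_y_inv @ y           # posterior mean of x
        Sigma = Gamma - Gamma @ L.T @ Sigma_y_inv @ L @ Gamma   # posterior covariance
        gamma = mu ** 2 + np.diag(Sigma)             # EM fixed-point hyperparameter update
    return mu, gamma

# toy usage: 20 sensors, 100 candidate sources, 3 truly active
rng = np.random.default_rng(0)
L = rng.standard_normal((20, 100))
x_true = np.zeros(100)
x_true[[5, 40, 77]] = [1.0, -2.0, 1.5]
y = L @ x_true + 0.1 * rng.standard_normal(20)
x_hat, gamma = sbl_ard(L, y)
```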


... Reconstructing brain activities from electroencephalography (EEG) plays an important role in neuroscience research and clinical treatment [2,15,34]. For example, for drug-resistant epilepsy, the epileptogenic zone can be removed through a surgical intervention. ...
... The position of each dipole is fixed, so cortical activities can be estimated by solving a linear inverse problem. Since the dipoles largely outnumber the scalp sensors, the forward equation of DCD is underdetermined [2,16,34]. To obtain a unique solution, suitable constraints are needed to narrow the solution space. ...
... Since the EEG inverse problem is highly ill-posed, suitable regularization constraints are necessary to obtain a unique source solution [13,34]. The traditional L2-norm-based methods (e.g., wMNE and LORETA) always ...
Article
Full-text available
It is a long-standing challenge to reconstruct the locations and extents of cortical neural activities from electroencephalogram (EEG) recordings, especially when the EEG signals contain strong background activities and outlier artifacts. In this work, we propose a robust source imaging method called L1R-SSSI. To alleviate the effect of outliers in EEG, L1R-SSSI employs the L1-loss to model the residual error. To obtain locally smooth and globally sparse estimations, L1R-SSSI adopts the structured sparsity constraint, which incorporates the L1-norm regularization in both the variation and original source domain. The estimations of L1R-SSSI are efficiently obtained using the alternating direction method of multipliers (ADMM) algorithm. Results of simulated and experimental data analysis demonstrate that L1R-SSSI effectively suppresses the effect of the outlier artifacts in EEG. L1R-SSSI outperforms the traditional L2-norm-based methods (e.g., wMNE, LORETA), and SISSY, which employs the L2-norm loss and structured sparsity, as indicated by the larger AUC (average AUC > 0.80), smaller SD (average SD < 50 mm), DLE (average DLE < 10 mm) and RMSE (average RMSE < 1.75) values under all the numerically simulated conditions. L1R-SSSI also provides better estimations of extended sources than the method with L1-loss and Lp-norm regularization term (e.g., LAPPS).
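Since the abstract names ADMM as the solver, a minimal sketch of the generic ADMM splitting for an L1-penalized least-squares problem is given below; this is illustrative machinery only, not the authors' L1R-SSSI updates, which additionally handle the L1 residual loss and variation-domain sparsity.

```python
# Illustrative sketch: ADMM for min_x 0.5*||L x - y||^2 + lam*||x||_1.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_lasso(L, y, lam=0.1, rho=1.0, n_iter=200):
    n = L.shape[1]
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)   # u: scaled dual variable
    LtL, Lty = L.T @ L, L.T @ y
    A_inv = np.linalg.inv(LtL + rho * np.eye(n))        # factor reused every iteration
    for _ in range(n_iter):
        x = A_inv @ (Lty + rho * (z - u))                # x-update: ridge-like solve
        z = soft_threshold(x + u, lam / rho)             # z-update: proximal step for the L1 term
        u = u + x - z                                    # dual update
    return z
```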
... Furthermore, these properties are included in the overall analysis through the assumed EEG sources' model and various assumptions about the model. Clearly, the linear observation model [17,18], the linear dynamical model (or Kalman filters) [17,18], and the multiple measurement vector (MMV) model [19] make different generative modelling assumptions about the underlying mechanisms that produce the EEG data. The spatial properties of EEG sources are encoded into the linear observation model through the use of prior distributions or regularization terms. ...
... where N denotes the Gaussian distribution. In the sparse Bayesian learning literature [18,25,26], a common approach is to assume that the covariance matrix Λ is a diagonal matrix with elements a_i^{-1}, i = 1, ..., 3M. ...
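For concreteness, the diagonal-covariance prior described in this excerpt usually takes the standard automatic-relevance-determination form shown below (generic notation, not necessarily the authors' exact parameterization):

```latex
p(\mathbf{x} \mid \mathbf{a}) = \mathcal{N}\big(\mathbf{x};\, \mathbf{0},\, \boldsymbol{\Lambda}\big),
\qquad
\boldsymbol{\Lambda} = \operatorname{diag}\big(a_1^{-1}, a_2^{-1}, \ldots, a_{3M}^{-1}\big)
```

with each precision a_i typically given its own hyperprior, so that sources whose a_i grows large are effectively pruned from the solution.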
Article
Full-text available
We propose a new method for EEG source localization. An efficient solution to this problem requires choosing an appropriate regularization term in order to constraint the original problem. In our work, we adopt the Bayesian framework to place constraints; hence, the regularization term is closely connected to the prior distribution. More specifically, we propose a new sparse prior for the localization of EEG sources. The proposed prior distribution has sparse properties favoring focal EEG sources. In order to obtain an efficient algorithm, we use the variational Bayesian (VB) framework which provides us with a tractable iterative algorithm of closed-form equations. Additionally, we provide extensions of our method in cases where we observe group structures and spatially extended EEG sources. We have performed experiments using synthetic EEG data and real EEG data from three publicly available datasets. The real EEG data are produced due to the presentation of auditory and visual stimulus. We compare the proposed method with well-known approaches of EEG source localization and the results have shown that our method presents state-of-the-art performance, especially in cases where we expect few activated brain regions. The proposed method can effectively detect EEG sources in various circumstances. Overall, the proposed sparse prior for EEG source localization results in more accurate localization of EEG sources than state-of-the-art approaches.
... All AI-based models consist of two main components: the classifier and the extracted features. Naive Bayes (NB), a fundamental classifier in machine learning, relies on Bayesian methods, demonstrating effectiveness in handling large training datasets and offering high accuracy in scenarios involving noisy data, as commonly encountered in breast cancer models [9,10]. In another study, Texton features were combined with NB classifiers to develop a diagnostic model [11]. ...
... Eqs. (9) to (11) ... Cost function = (2 − (Sensitivity + Specificity)) + min(Sensitivity − Specificity) (12). In this equation, where the maximum values for sensitivity and specificity are both 1, the minimum and maximum values for the cost function range from 0 to 2. The first term of the cost function promotes high values for both sensitivity and specificity, consequently resulting in an elevated accuracy level. Simultaneously, the second term acts as a stabilizing factor, preventing the function from becoming unbalanced. ...
Article
Full-text available
Breast cancer is a global health concern, ranking as the second leading cause of death among women. Current screening methods, such as mammography, face limitations, particularly for women under 50 due to radiation concerns and frequency of examination restrictions. MammoWave, utilizing microwave signals (1 to 9 GHz), emerges as an innovative and safe technology for breast cancer detection. This paper focuses on the numerical data extracted from MammoWave, presenting a hierarchical approach to address challenges posed by a diverse dataset of over 1000 samples from two European hospitals. The proposed approach involves unsupervised clustering to classify data into two main groups, followed by binary classification within each group to distinguish healthy and non-healthy cases. Careful consideration is given to feature extraction methods and classifiers at each step. The unique influence of sub-bands within the 1 to 9 GHz range on the diagnosis model is observed, leading to the selection of suitable sub-bands, feature extraction methods, and classification models. An optimization algorithm and a defined cost function are employed to achieve high and balanced sensitivity, specificity, and accuracy values. Experimental results showcase a promising overall balanced performance of around 70 %, representing a significant milestone in breast cancer detection using microwave imaging. MammoWave, with its novel approach, provides a solution that overcomes age and frequency of examination related limitations associated with existing screening methods, contributing to enhanced breast health monitoring for a broader population.
... This problem is addressed by imposing prior distributions on the model parameters and adopting a Bayesian treatment. This can be performed either through Maximum-a-Posteriori (MAP) estimation (Type-I Bayesian learning) [23]-[27] or, when the model has unknown hyperparameters, through Type-II Maximum-Likelihood estimation (Type-II Bayesian learning) [28]-[32]. In this paper, we focus on Type-II Bayesian learning, which assumes a family of prior distributions p(X|Θ) parameterized by a set of hyperparameters Θ. ...
... Summing up (32) and (34) proves inequality (30) and concludes the first part of the proof. ...
Preprint
Full-text available
We consider the reconstruction of brain activity from electroencephalography (EEG). This inverse problem can be formulated as a linear regression with independent Gaussian scale mixture priors for both the source and noise components. Crucial factors influencing accuracy of source estimation are not only the noise level but also its correlation structure, but existing approaches have not addressed estimation of noise covariance matrices with full structure. To address this shortcoming, we develop hierarchical Bayesian (type-II maximum likelihood) models for observations with latent variables for source and noise, which are estimated jointly from data. As an extension to classical sparse Bayesian learning (SBL), where across-sensor observations are assumed to be independent and identically distributed, we consider Gaussian noise with full covariance structure. Using the majorization-maximization framework and Riemannian geometry, we derive an efficient algorithm for updating the noise covariance along the manifold of positive definite matrices. We demonstrate that our algorithm has guaranteed and fast convergence and validate it in simulations and with real MEG data. Our results demonstrate that the novel framework significantly improves upon state-of-the-art techniques in the real-world scenario where the noise is indeed non-diagonal and fully-structured. Our method has applications in many domains beyond biomagnetic inverse problems.
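One ingredient mentioned above is a Riemannian update of the noise covariance along the manifold of positive definite matrices. A minimal sketch of the matrix geometric mean, a standard building block for such updates, is shown below; the data, dimensions, and the way the two matrices are blended are illustrative assumptions, not the paper's full algorithm.

```python
# Illustrative sketch: geometric mean A#B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}
# on the manifold of symmetric positive definite matrices.
import numpy as np
from scipy.linalg import sqrtm

def geometric_mean(A, B):
    """Riemannian (geometric) mean of two symmetric positive definite matrices."""
    A_half = sqrtm(A)
    A_half_inv = np.linalg.inv(A_half)
    middle = sqrtm(A_half_inv @ B @ A_half_inv)
    return np.real(A_half @ middle @ A_half)    # discard tiny imaginary round-off

# toy usage: blend a current noise-covariance estimate with an empirical one
rng = np.random.default_rng(1)
X = rng.standard_normal((64, 200))              # e.g., 64 sensors, 200 residual samples
S_empirical = X @ X.T / X.shape[1]              # sample covariance of residuals
S_current = np.eye(64)                          # previous model covariance
S_next = geometric_mean(S_current, S_empirical)
```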
... However, if this assumption is incorrect, the classification hypothesis becomes a mere approximation. Despite this limitation, the NB classifier has demonstrated high accuracy in its predictions, even though the evaluation of the function may be achieved with lower accuracy [42], [43], [44]. ...
Article
Full-text available
Statistics show that among the 1.67 million reported cancer cases worldwide, breast cancer is the most common cancer among women and constitutes the largest burden of the disease in developing countries. However, if detected early enough, it can be managed. Mammography is one of the best ways to identify and diagnose breast abnormalities among various medical imaging modalities. It typically detects signs and symptoms of breast cancer, including microcalcifications, lumps, nodules, architectural abnormalities, asymmetry, bilateral asymmetry, etc. These features can be benign or cancerous when they appear in the breast. Researchers have focused on creating fully automated computer-aided detection methods to help radiologists combat this type of cancer. Artificial intelligence (AI)-based algorithms have been essential in creating systems that allow for automated diagnosis, rapid response, and low mortality. In this work, several machine learning methods were compared, including logistic regression, Gaussian naive Bayes, support vector machines (SVM), linear SVM, and artificial neural networks (ANN). Processing time and accuracy were the main evaluation metrics, where naive Bayes outperformed SVM, followed by linear SVM and logistic regression, with ANNs trailing in accuracy. These results highlight how naive Bayes algorithms can help in the early detection of breast cancer, leading to faster and more efficient treatments and ultimately better patient care.
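To make the comparison concrete, a small scikit-learn sketch of the classifier families named in this abstract is given below, run on scikit-learn's built-in breast cancer dataset rather than the mammography data used in the study.

```python
# Illustrative sketch: comparing naive Bayes, SVM variants, and logistic regression.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC, LinearSVC
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
models = {
    "Gaussian naive Bayes": GaussianNB(),
    "SVM (RBF kernel)": make_pipeline(StandardScaler(), SVC()),
    "Linear SVM": make_pipeline(StandardScaler(), LinearSVC(max_iter=5000)),
    "Logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)          # 5-fold cross-validated accuracy
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```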
... CART [20], C4.5 [21], C5.0 [22], and conditional tree [16], [23] are all decision tree methods. Decision trees are a powerful technique used in several disciplines, including machine learning, image recognition, and analytical thinking [24]. DTs are a collection of stages that proficiently and harmoniously unite a sequence of basic judgements, in which each test matches a quantitative attribute to a predicted value [25]. ...
... The SVM algorithm aims to find a hyperplane, which, in this context, is represented as a line. The objective is to maximize the separation width. Support vectors refer to the data points that are in close proximity to the hyperplane. The concept of maximum margin was initially put forth by Vapnik in 1963, and subsequently, the support vector machine (SVM) method was developed. 4. Naïve Bayes: The Naïve Bayes algorithm [15] is a probabilistic form of machine learning that is utilized for classification-related applications. It is based on Bayes' theorem with the idea that predictors can be considered independent of one another. ...
Chapter
A major objective of this book series is to drive innovation in every aspect of Artificial Intelligence. It offers researchers, educators and students the opportunity to discuss and share ideas on topics, trends and developments in the fields of artificial intelligence, machine learning, deep learning, big data, computer science, computational intelligence and technology. It aims to bring together experts from various disciplines to emphasize the dissemination of ongoing research in the fields of science and computing, computational intelligence, pattern recognition and information retrieval. The content of the book is as follows.
... Moreover, the lack of inference capability in vanilla GANs hinders insight into the structural information of EEG signals. On the other hand, probabilistic graphical models (Koller and Friedman, 2009; Wu et al., 2015) enable inference through structured representations but often lack the capability to model arbitrarily complex distributions. ...
Preprint
Full-text available
A deep latent variable model is a powerful method for capturing complex distributions. These models assume that underlying but unobserved structures are present within the data. In this dissertation, we explore high-dimensional problems related to physiological monitoring using latent variable models. First, we present a novel deep state-space model to generate electrical waveforms of the heart using optically obtained signals as inputs. This can bring about clinical diagnoses of heart disease via simple assessment through wearable devices. Second, we present a brain signal modeling scheme that combines the strengths of probabilistic graphical models and deep adversarial learning. The structured representations can provide interpretability and encode inductive biases to reduce the data complexity of neural oscillations. The efficacy of the learned representations is further studied in epilepsy seizure detection formulated as an unsupervised learning problem. Third, we propose a framework for the joint modeling of physiological measures and behavior. Existing methods to combine multiple sources of brain data are limited. Direct analysis of the relationship between different types of physiological measures usually does not involve behavioral data. Our method can identify the unique and shared contributions of brain regions to behavior and can be used to discover new functions of brain regions. The success of these innovative computational methods would allow the translation of biomarker findings across species and provide insight into neurocognitive analysis in numerous biological studies and clinical diagnoses, as well as emerging consumer applications.
... For the resolution of the M/EEG inverse problem, many sophisticated algorithms have been developed during the last decades based on different techniques: regularization [4], [6], [9], machine learning [7], [17] and probabilistic approaches [12], [13]. One requested feature of such algorithms is the capability to take into account prior information on source localization coming from clinical analysis, in order to a priori exclude or rank some brain regions, thus defining the so-called region of interest (ROI) [16]. ...
... Along this line, the Lp norm iterative sparse solution (LPISS) [11] is an iterative sparse learning algorithm based on an Lp norm. Sparse Bayesian learning approaches (SBL) [12]-[14] cast the inverse problem under an empirical Bayesian framework where hyperparameters can be automatically determined and sparse solutions can be obtained. In particular, Champagne [15], [16] is an SBL approach that estimates the number, location, and time course of the sources in a principled fashion. ...
Article
Full-text available
Electromagnetic source imaging (ESI) requires solving a highly ill-posed inverse problem. To seek a unique solution, traditional ESI methods impose various forms of priors that may not accurately reflect the actual source properties, which may hinder their broad applications. To overcome this limitation, in this paper a novel data-synthesized spatio-temporally convolutional encoder-decoder network method termed DST-CedNet is proposed for ESI. DST-CedNet recasts ESI as a machine learning problem, where discriminative learning and latent-space representations are integrated in a convolutional encoder-decoder network (CedNet) to learn a robust mapping from the measured electroencephalography/magnetoencephalography (E/MEG) signals to the brain activity. In particular, by incorporating prior knowledge regarding dynamical brain activities, a novel data synthesis strategy is devised to generate large-scale samples for effectively training CedNet. This stands in contrast to traditional ESI methods, where the prior information is often enforced via constraints primarily aimed at mathematical convenience. Extensive numerical experiments as well as analysis of a real MEG and epilepsy EEG dataset demonstrate that DST-CedNet outperforms several state-of-the-art ESI methods in robustly estimating source signals under a variety of source configurations.
... SBL provides a powerful framework for learning parsimonious linear latent variable models from EEG, with applications encompassing electromagnetic source imaging [41], [42], probabilistic generative modeling of EEG (e.g., oscillations and ERPs) [43], [44], and EEG decoding [11], [45]. In essence, SBL is an empirical Bayes paradigm that imposes a parameterized prior on the latent variables and enforces sparsity by maximizing the marginal likelihood (also known as Type-II maximum likelihood, or evidence maximization). ...
Preprint
Decoding brain activity from non-invasive electroencephalography (EEG) is crucial for brain-computer interfaces (BCIs) and the study of brain disorders. Notably, end-to-end EEG decoding has gained widespread popularity in recent years owing to the remarkable advances in deep learning research. However, the sample sizes in many EEG studies are often too limited to prevent generic deep learning models from overfitting the highly noisy EEG data, leading to only suboptimal generalization performance. To address this fundamental limitation, this paper proposes a novel end-to-end EEG decoding algorithm in which the spatio-temporal filters and the classifier are all encoded in a low-rank weight matrix and optimized under a principled sparse Bayesian learning (SBL) framework. Importantly, this SBL framework also enables us to learn the hyperparameters that optimally penalize the model in a Bayesian fashion. The performance of the proposed decoding algorithm is systematically assessed on five motor imagery EEG datasets (N = 192) and an emotion recognition EEG dataset (N = 45), in comparison with several contemporary algorithms, including end-to-end deep learning-based EEG decoding algorithms. The classification results demonstrate that our algorithm significantly outperforms the competing algorithms while yielding neurophysiologically meaningful spatio-temporal patterns. Our algorithm therefore advances the state of the art by providing a novel EEG-tailored machine learning tool for decoding brain activity.
... NB is used to compute the probability using Bayesian theory. It provides the simplest implementation and requires little training time while achieving high accuracy when computing the probabilities of noisy data. NB methods include Multinomial NB, Bernoulli NB, and Complement NB (Wu et al., 2015). ...
Preprint
Full-text available
With the emergence of the COVID-19 pandemic, e-learning was the only way to solve the problem of study interruption in educational institutions and universities. Therefore, this field has received significant attention in recent times. In this paper, we used ten machine learning (ML) algorithms (Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), SGD Classifier, Multinomial NB, K-Nearest Neighbors Classifier (KNN), Ridge Classifier, Nearest Centroid, Complement NB and Bernoulli NB) to build a prediction system based on artificial intelligence techniques to predict the difficulties students face in using the e-learning management system and to support related decision-making. This, in turn, contributes to supporting the sustainable development of technology at the university. From the results obtained, we detect the important factors that affect the use of e-learning to solve students' learning difficulties using an LMS by building a prediction system based on AI techniques.
... Furthermore, the Bayesian statistical method in combination with a logistic regression model is an efficient approach to understanding a problem domain and predicting the outcomes of interventions. [15,16] In the present study, a logistic regression model with Bayesian supervised learning inference was employed to elucidate quantitative effects of 1-to 6-comorbidity risk factors for dementia, namely depression, vascular disease, severe head injury, hearing loss, DM, and senile cataract, which were identified from a nationwide longitudinal population-based database. ...
Article
Full-text available
Dementia is one of the most burdensome illnesses in elderly populations worldwide. However, the literature about multiple risk factors for dementia is scant. To develop a simple, rapid, and appropriate predictive tool for the clinical quantitative assessment of multiple risk factors for dementia. A population-based cohort study. Based on the Taiwan National Health Insurance Research Database, participants first diagnosed with dementia from 2000 to 2009 and aged ≥65 years in 2000 were included. A logistic regression model with Bayesian supervised learning inference was implemented to evaluate the quantitative effects of 1- to 6-comorbidity risk factors for dementia in the elderly Taiwanese population: depression, vascular disease, severe head injury, hearing loss, diabetes mellitus (DM), and senile cataract, identified from a nationwide longitudinal population-based database. This study enrolled 4749 (9.5%) patients first diagnosed as having dementia. Older age, female sex, urban residence, and low income were found to be independent sociodemographic risk factors for dementia. Among all odds ratios (ORs) of 2-comorbidity risk factors for dementia, comorbid depression and vascular disease had the highest adjusted OR of 6.726. The 5-comorbidity risk factors, namely depression, vascular disease, severe head injury, hearing loss, and DM, exhibited the highest OR of 8.767. Overall, the quantitative effects of 2 to 6 comorbidities and age difference on dementia gradually increased; hence, their ORs were less than additive. These results indicate that depression is a key comorbidity risk factor for dementia. The present findings suggest that physicians should pay more attention to the role of depression in dementia development. There is an urgent need to evaluate the nature of the link between depression and dementia and to test to what extent controlling depression could effectively lead to the prevention of dementia. Abbreviations: ADVI = automatic differentiation variational inference, DM = diabetes mellitus, ICD-9-CM = International Classification of Disease, Ninth Revision, Clinical Modification, NHIRD = National Health Insurance Research Database, OR = odds ratio.
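The modelling idea, a logistic regression with Bayesian inference over comorbidity indicators, can be sketched as below. The cited study used automatic differentiation variational inference (ADVI); this simplified stand-in uses MAP estimation with a Gaussian prior on synthetic data, and the feature names are hypothetical.

```python
# Illustrative sketch: Bayesian-flavored logistic regression (MAP with Gaussian prior).
import numpy as np
from scipy.optimize import minimize

def neg_log_posterior(w, X, y, prior_var=10.0):
    """Negative log-posterior of logistic regression with a zero-mean Gaussian prior."""
    z = X @ w
    loglik = np.sum(y * z - np.log1p(np.exp(z)))       # Bernoulli log-likelihood
    logprior = -0.5 * np.sum(w ** 2) / prior_var        # Gaussian log-prior (up to constants)
    return -(loglik + logprior)

rng = np.random.default_rng(0)
# hypothetical binary comorbidity indicators, e.g. depression, vascular disease, DM
risk = rng.integers(0, 2, size=(500, 3)).astype(float)
X = np.column_stack([np.ones(500), risk])               # add an intercept column
true_w = np.array([-1.0, 1.2, 0.8, 0.4])
y = (rng.random(500) < 1.0 / (1.0 + np.exp(-(X @ true_w)))).astype(float)

res = minimize(neg_log_posterior, np.zeros(4), args=(X, y))
odds_ratios = np.exp(res.x[1:])                          # posterior-mode odds ratios
print(dict(zip(["depression", "vascular_disease", "DM"], odds_ratios.round(2))))
```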
... Over the last decades, the Bayesian compressive sensing (BCS) framework, which originated from sparse Bayesian learning (SBL), has become an active sub-class of sparse signal reconstruction algorithms [1-7] and has been widely applied in many fields, such as array synthesis [8], direction-of-arrival (DOA) estimation [9], radar localisation and imaging [10-12], and electrocardiogram, electroencephalography (EEG), and magnetoencephalography (MEG) signal processing [13,14]. ...
Article
Full-text available
Bayesian compressive sensing (BCS) is an important sub-class of sparse signal reconstruction algorithms. In this paper, a modified complex multitask Bayesian compressive sensing (MCMBCS) algorithm using the Laplacian scale mixture (LSM) prior is proposed. The LSM prior is first introduced into the complex BCS framework by exploiting its better sparse characteristics and flexibility compared with the traditional Laplacian prior. Furthermore, by integrating out the noise variance analytically, the MCMBCS algorithm significantly improves the signal recovery performance compared with the original CMBCS. More importantly, the authors not only present the iterative algorithm but also develop a sub-optimal fast implementation method based on marginal likelihood maximisation, which dramatically reduces the computational complexity. Finally, sufficient numerical simulations validate the better performance of the proposed algorithm in reconstruction accuracy and computational effectiveness compared with existing work. It is revealed that the proposed algorithm has great potential in the complex-valued signal processing field.
... NB is used to compute the probability using Bayesian theory. It provides the simplest implementation and requires little training time while achieving high accuracy when computing the probabilities of noisy data. NB methods include Multinomial NB, Bernoulli NB, and Complement NB (Wu et al., 2015). ...
Preprint
Full-text available
With the emergence of the COVID-19 pandemic, e-learning was the only way to solve the problem of study interruption in educational institutions and universities. Therefore, this field has garnered significant attention in recent times. In this paper, we used ten machine-learning algorithms (Logistic Regression, Decision Tree, Random Forest, SGD Classifier, Multinomial NB, K-Neighbors Classifier, Ridge Classifier, Nearest Centroid, Complement NB and Bernoulli NB) to build a prediction system based on artificial intelligence techniques to predict the difficulties students face in using the e-learning management system, and to support related decision-making. This, in turn, contributes to supporting the sustainable development of technology at the university. From the results obtained, we found the important factors that affect the use of e-learning to solve students' learning difficulties using an LMS.
... • Naïve Bayes: Naïve Bayes [95] classifiers are simple probabilistic classifiers. They are based on the following assumption: all the input features are independent of each other and no correlation exists between them. ...
Thesis
Video content now accounts for about 82% of global internet traffic. This large share is due to the revolution in video content consumption. At the same time, the market increasingly demands videos with higher resolutions and qualities, which causes a significant increase in the amount of data to be transmitted. Hence there is a need to develop video coding algorithms even more efficient than existing ones, to limit the growth of data transmission rates and ensure a better quality of service. In addition, the massive consumption of multimedia content on electronic products has an ecological impact. Therefore, finding a compromise between the complexity of algorithms and the efficiency of implementations is a new challenge. As a result, a collaborative team was created with the aim of developing a new video coding standard, Versatile Video Coding (VVC/H.266). Although VVC achieves a bit-rate reduction of more than 40% compared to HEVC, this does not mean that there is no longer a need to further improve coding efficiency; in addition, VVC adds remarkable complexity compared to HEVC. This thesis responds to these problems by proposing three new encoding methods. The contributions of this research are divided into two main axes. The first axis is to propose and implement new compression tools in the new standard, capable of generating additional coding gains. Two methods have been proposed for this first axis, both relying on the derivation of prediction information at the decoder side. Increasing the encoder's choices can improve the accuracy of predictions and yield residuals with less energy, leading to a reduction in bit rate. Nevertheless, more prediction modes involve more signaling to be sent in the bitstream to inform the decoder of the choices that have been made at the encoder, so the gains mentioned above can be more than offset by the added signaling. If the prediction information is derived at the decoder side, the decoder is no longer passive but becomes active, hence the concept of an intelligent decoder. It then becomes unnecessary to signal this information, yielding a saving in signaling. Each of the two methods offers a different intelligent technique for deriving the prediction information at the decoder. The first technique constructs a histogram of gradients to deduce different intra-prediction modes that can then be combined by means of prediction fusion to obtain the final intra-prediction for a given block. This fusion property makes it possible to more accurately predict areas with complex textures, which, in conventional coding schemes, would rather require partitioning and/or finer transmission of high-energy residuals. The second technique gives VVC the ability to switch between different interpolation filters for inter prediction; the optimal filter selected by the encoder is deduced through convolutional neural networks. The second axis, unlike the first, does not seek to add a contribution to the VVC algorithm; it rather aims to build an optimized use of the already existing algorithm. The ultimate goal is to find the best possible compromise between the compression efficiency delivered and the complexity imposed by VVC tools. Thus, an optimization system is designed to determine an effective technique for activating the new coding tools. The determination of these tools can be done either using artificial neural networks or without any artificial intelligence technique.
... Detection of spatiotemporal features of esophageal abnormality from endoscopic videos by incorporating a 3D convolutional neural network and convolutional long short-term memories (LSTM) was reported in [38] for the first time. Bayesian machine learning (BML) was discussed as a method to extract informative brain spatiotemporal-spectral patterns from electroencephalography (EEG) and magnetoencephalography (MEG) signals [39]. ...
Article
Full-text available
A low-cost machine learning (ML) algorithm is proposed and discussed for spatial tracking of unknown, correlated signals in localized, ad-hoc wireless sensor networks. Each sensor is modeled as one neuron, and a selected subset of these neurons is called upon to identify the spatial signal. The algorithm is implemented in two phases: spatial modeling and spatial tracking. The spatial signal is modeled using its M iso-contour lines at levels {ℓ_j}, j = 1, ..., M, and those sensors whose observations are within a Δ margin of any of these levels report their observations to the fusion center (FC) for spatial signal reconstruction. In the spatial modeling phase, the number of these contour lines, their levels, and a proper Δ are identified. In this phase, the algorithm may use either an adaptive-weight stochastic gradient or a scaled stochastic gradient method to select a proper Δ. Additive white Gaussian noise (AWGN) with zero mean is assumed on the sensor observations. To reduce the effect of observation noise, each sensor applies a moving average filter to its observations. The modeling performance, cost, and convergence of the algorithm are discussed based on extensive computer simulations and reasoning. The algorithm is proposed for climate and environmental monitoring. In this paper, the percentage of wireless sensors that initiate a communication attempt is taken as the cost. The performance evaluation results show that the proposed spatial tracking approach is low-cost and can model the spatial signal over time with the same performance as that of spatial modeling.
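The reporting rule described in this abstract (a sensor transmits only if its observation falls within a Δ margin of one of the M contour levels) can be sketched as follows; the field, levels, and margin are made-up values for illustration only.

```python
# Illustrative sketch: which sensors report to the fusion center under the contour-margin rule.
import numpy as np

def reporting_sensors(observations, levels, delta):
    """Return indices of sensors whose observation is within delta of any contour level."""
    obs = np.asarray(observations)[:, None]           # shape (n_sensors, 1)
    lvl = np.asarray(levels)[None, :]                 # shape (1, M)
    return np.where(np.any(np.abs(obs - lvl) <= delta, axis=1))[0]

# toy usage: 100 sensors sampling a smooth spatial field, 3 contour levels
rng = np.random.default_rng(0)
positions = rng.random((100, 2))
field = np.sin(3 * positions[:, 0]) + np.cos(2 * positions[:, 1])
noisy = field + 0.05 * rng.standard_normal(100)       # a moving-average filter would smooth this over time
idx = reporting_sensors(noisy, levels=[-0.5, 0.0, 0.5], delta=0.05)
print(f"{len(idx)} of 100 sensors report to the fusion center")
```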
... Concrete instantiations of this approach have further been introduced under the names sparse Bayesian learning (SBL) (Tipping, 2001) or automatic relevance determination (ARD) (Tipping, 2000), kernel Fisher discriminant (KFD) (Mika et al., 2001), variational Bayes (VB) (Seeger and Wipf, 2010;Wipf and Nagarajan, 2009) and iteratively-reweighted MAP estimation (Gorodnitsky et al., 1995;. Interested readers are referred to (Wu et al., 2016) for a comprehensive survey on Bayesian machine learning techniques for EEG/MEG signals. To distinguish all these Type-II variants from classical ML and MAP approaches not involving hyperparameter learning, the latter are also referred to as Type-I approaches. ...
Article
Full-text available
Methods for electro- or magnetoencephalography (EEG/MEG) based brain source imaging (BSI) using sparse Bayesian learning (SBL) have been demonstrated to achieve excellent performance in situations with low numbers of distinct active sources, such as event-related designs. This paper extends the theory and practice of SBL in three important ways. First, we reformulate three existing SBL algorithms under the majorization-minimization (MM) framework. This unification perspective not only provides a useful theoretical framework for comparing different algorithms in terms of their convergence behavior, but also provides a principled recipe for constructing novel algorithms with specific properties by designing appropriate bounds of the Bayesian marginal likelihood function. Second, building on the MM principle, we propose a novel method called LowSNR-BSI that achieves favorable source reconstruction performance in low signal-to-noise-ratio (SNR) settings. Third, precise knowledge of the noise level is a crucial requirement for accurate source reconstruction. Here we present a novel principled technique to accurately learn the noise variance from the data either jointly within the source reconstruction procedure or using one of two proposed cross-validation strategies. Empirically, we could show that the monotonous convergence behavior predicted from MM theory is confirmed in numerical experiments. Using simulations, we further demonstrate the advantage of LowSNR-BSI over conventional SBL in low-SNR regimes, and the advantage of learned noise levels over estimates derived from baseline data. To demonstrate the usefulness of our novel approach, we show neurophysiologically plausible source reconstructions on averaged auditory evoked potential data.
... Detection of spatiotemporal features of esophageal abnormality from endoscopic videos by incorporating a 3D convolutional neural network and convolutional long short-term memories (LSTM) was reported in [38] for the first time. Bayesian machine learning (BML) was discussed as a method to extract informative brain spatiotemporal-spectral patterns from electroencephalography (EEG) and magnetoencephalography (MEG) signals [39]. ...
Preprint
A low-cost machine learning (ML) algorithm is proposed and discussed for spatial tracking of unknown, correlated signals in localized, ad-hoc wireless sensor networks. Each sensor is modeled as one neuron, and a selected subset of these neurons is called upon to identify the spatial signal. The algorithm is implemented in two phases: spatial modeling and spatial tracking. The spatial signal is modeled using its M iso-contour lines at levels {ℓ_j}, j = 1, ..., M, and those sensors whose observations are within a Δ margin of any of these levels report their observations to the fusion center (FC) for spatial signal reconstruction. In the spatial modeling phase, the number of these contour lines, their levels, and a proper Δ are identified. In this phase, the algorithm may use either an adaptive-weight stochastic gradient or a scaled stochastic gradient method to select a proper Δ. Additive white Gaussian noise (AWGN) with zero mean is assumed on the sensor observations. To reduce the effect of observation noise, each sensor applies a moving average filter to its observations. The modeling performance, cost, and convergence of the algorithm are discussed based on extensive computer simulations and reasoning. The algorithm is proposed for environmental monitoring. In this paper, the percentage of communication attempts of wireless sensors is taken as the cost. Performance evaluation results show that the proposed spatial tracking approach is low-cost and can model the spatial signal over time with the same performance as that of spatial modeling.
... Various methods have been used to extract features from EEG signals. Popular methods are entropy [14], detrended moving average (DMA) [15], isomap-based estimation [16], Bayesian methods [17], and others [18]. In the past decade, entropy algorithms have been widely used for feature extraction in anaesthetic EEG signals. ...
Article
Full-text available
Anaesthesia is a state of temporary, controlled loss of awareness induced for medical operations. An accurate assessment of the depth of anaesthesia (DoA) helps anaesthesiologists to avoid awareness during surgery and keep the recovery period short. However, existing DoA algorithms have limitations, such as not being robust enough across different patients and having a time delay in assessment. In this study, to develop a reliable DoA measurement method, pre-denoised electroencephalograph (EEG) signals are divided into ten frequency bands (α, β1, β2, β3, β4, β, βγ, γ, δ and θ), and features are extracted from the different frequency bands using spectral entropy (SE) methods. SE from the beta-gamma frequency band (21.5–38.5 Hz) and SE from the beta frequency band show the highest correlation (R-squared values: 0.8458 and 0.7312, respectively) with the most popular DoA index, the bispectral index (BIS). In this research, a new DoA index is developed based on these two SE features for monitoring the DoA. The highest Pearson correlation coefficient against the BIS index on the testing data is 0.918, and the average is 0.80. In addition, the proposed index reacts earlier than the BIS index when the patient goes from deep anaesthesia to moderate anaesthesia, which makes it more suitable for real-time DoA assessment. In cases of poor signal quality (SQ), while the BIS index exhibits inflexibility, the new proposed index shows reliable assessment results that reflect the clinical observations.
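A minimal sketch of the band-limited spectral entropy feature described above is given below; the band edges and synthetic signal are examples, not the paper's exact processing pipeline.

```python
# Illustrative sketch: normalized spectral entropy of an EEG epoch within a frequency band.
import numpy as np
from scipy.signal import welch

def band_spectral_entropy(x, fs, band):
    """Normalized spectral entropy of signal x within [band[0], band[1]] Hz."""
    freqs, psd = welch(x, fs=fs, nperseg=min(len(x), fs * 2))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    p = psd[mask]
    p = p / p.sum()                              # normalize PSD to a probability distribution
    return -np.sum(p * np.log2(p)) / np.log2(len(p))

# toy usage: 10 s of synthetic "EEG" sampled at 128 Hz
fs = 128
t = np.arange(0, 10, 1 / fs)
eeg = np.sin(2 * np.pi * 25 * t) + 0.5 * np.random.randn(len(t))
print(band_spectral_entropy(eeg, fs, band=(21.5, 38.5)))   # "beta-gamma" band as an example
```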
... The emerging intelligent applications and high-performance systems require more complexity and demand sensory units to accurately describe the physical object. The decision-making unit or algorithm can therefore output a more reliable result (Khezri and Jahed, 2007; Wu et al., 2016; He et al., 2017; Liang et al., 2018, 2019). Depending on the signal acquiring position, Figure 1 illustrates four biopotential sensors and two widely used wearable sensors along with their learning systems and applications, which have also been summarized in Table 1. ...
Article
Full-text available
Wearable devices are a fast-growing technology with impact on personal healthcare for both society and the economy. Due to the widespread deployment of sensors in pervasive and distributed networks, power consumption, processing speed, and system adaptation are vital in future smart wearable devices. The visioning and forecasting of how to bring computation to the edge in smart sensors have already begun, with an aspiration to provide adaptive extreme edge computing. Here, we provide a holistic view of hardware and theoretical solutions toward smart wearable devices that can provide guidance to research in this pervasive computing era. We propose various solutions for biologically plausible models for continual learning in neuromorphic computing technologies for wearable sensors. To envision this concept, we provide a systematic outline in which prospective low power and low latency scenarios of wearable sensors in neuromorphic platforms are expected. We successively describe vital potential landscapes of neuromorphic processors exploiting complementary metal-oxide semiconductors (CMOS) and emerging memory technologies (e.g., memristive devices). Furthermore, we evaluate the requirements for edge computing within wearable devices in terms of footprint, power consumption, latency, and data size. We additionally investigate the challenges beyond neuromorphic computing hardware, algorithms and devices that could impede enhancement of adaptive edge computing in smart wearable devices.
... To date, machine learning methods have generated growing interests in medicine for its promising applications in medical diagnosis, EEG and medical imaging analysis, and mental health (Wu et al., 2016;Rajkomar et al. 2019;Chambon et al., 2016). Deep learning (or deep neural network) approaches have demonstrated extraordinary regression or classification performance as a result of the increased computing power and the mining of a large number of samples (LeCun et al., 2015;Schmidhuber, 2015). ...
Article
Objective: Automatic detection of interictal epileptiform discharges (IEDs, or ``spikes'' for short) from an epileptic brain can help predict seizure recurrence and support the diagnosis of epilepsy. Developing fast, reliable and robust detection methods for IEDs based on scalp or intracortical EEG may facilitate online seizure monitoring and closed-loop neurostimulation. Approach: We developed a new deep learning approach, which employs a long short-term memory (LSTM) network architecture (``IEDnet'') and an auxiliary classifier generative adversarial network (AC-GAN), to train on both expert-annotated and augmented spike events from intracranial electroencephalography (iEEG) recordings of epilepsy patients. We validated our IEDnet with two real-world iEEG datasets, and compared IEDnet with the support vector machine (SVM) and random forest (RF) classifiers on their detection performances. Main results: IEDnet achieved excellent cross-validated detection performances in terms of both sensitivity and specificity, and outperformed SVM and RF. Synthetic spike samples augmented by AC-GAN further improved the detection performance. In addition, the performance of IEDnet was robust with respect to the sampling frequency and noise. Furthermore, we also demonstrated the cross-institutional generalization ability of IEDnet by testing across the two datasets. Significance: IEDnet achieves excellent detection performances in identifying interictal spikes. AC-GAN can produce augmented iEEG samples to improve supervised deep learning.
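As a rough illustration of the architecture class (not the authors' IEDnet, whose layer configuration and AC-GAN augmentation are not reproduced here), a minimal LSTM-based binary classifier for fixed-length iEEG segments might look like this in PyTorch:

```python
# Illustrative sketch: a tiny LSTM classifier for spike vs. non-spike iEEG segments.
import torch
import torch.nn as nn

class TinySpikeNet(nn.Module):
    def __init__(self, n_channels=16, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_channels, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)              # logits for spike vs. non-spike

    def forward(self, x):                             # x: (batch, time, channels)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])                     # classify from the last hidden state

model = TinySpikeNet()
segments = torch.randn(8, 200, 16)                    # 8 segments, 200 time samples, 16 channels
logits = model(segments)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 2, (8,)))
```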
... Therefore, the art demand preference analysis system proposed in this paper can detect the desire of consumers in major cities for various artworks. With the support of big data, the machine learning method [9][10] is used to monitor the search volume for different artworks by users, and then statistical analysis is conducted on the preferences of users in various provinces and cities for artworks. The system searches over a large number of search keywords to form a memory, sets the retrieval requirements, monitors user searches in real time, establishes an adaptive search model that changes synchronously with users, and predicts the demand of residents for art in the area. ...
Article
Full-text available
In less than 30 years, the Chinese art market has completed a remarkable transformation from traditional to modern. In the process of development, great changes have taken place in the form of China's art market. The scale of art transactions is gradually expanding, and prices in the Chinese art market are getting higher and higher. In order to better achieve efficient retrieval in the big data context and understand the demand of the contemporary art market, this paper studies an art demand preference analysis system based on big data. Building on the big data environment and using machine learning methods, it detects the general public's interest in art more accurately. By extracting user preferences from the big data of users' search records, a search model that changes synchronously with user preferences is established using machine learning methods, so as to predict users' preferences for artworks in advance. The system is applied to Baidu, Jingdong and other platforms, and a questionnaire survey is carried out. The results show that the detected data are similar to the survey results.
... Nonparametric Bayesian approaches such as Gaussian processes (GPs) are closely related to ANNs. GPs have recently been used for system identification purposes [88-90] and applied to analyse neurophysiological signals [91], such as the use of GP modelling for EEG-based seizure detection and prediction [92] and heteroscedastic modelling of noisy high-dimensional MEG data [93]. Compared with ANNs, GPs can be applied to model datasets with small sample sizes, and they have a relatively small number of hyperparameters. ...
Article
Full-text available
The human nervous system is one of the most complicated systems in nature. Complex nonlinear behaviours have been shown from the single neuron level to the system level. For decades, linear connectivity analysis methods, such as correlation, coherence and Granger causality, have been extensively used to assess the neural connectivities and input-output interconnections in neural systems. Recent studies indicate that these linear methods can only capture a certain amount of neural activities and functional relationships, and therefore cannot describe neural behaviours in a precise or complete way. In this review, we highlight recent advances in nonlinear system identification of neural systems, corresponding time and frequency domain analysis, and novel neural connectivity measures based on nonlinear system identification techniques. We argue that nonlinear modelling and analysis are necessary to study neuronal processing and signal transfer in neural systems quantitatively. These approaches can hopefully provide new insights to advance our understanding of neurophysiological mechanisms underlying neural functions. These nonlinear approaches also have the potential to produce sensitive biomarkers to facilitate the development of precision diagnostic tools for evaluating neurological disorders and the effects of targeted intervention.
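As a small illustration of the nonparametric Bayesian modelling mentioned in the preceding excerpt, the sketch below fits a Gaussian-process regressor to a noisy nonlinear input-output relationship with scikit-learn; the kernel choice and data are placeholders, not taken from the review.

```python
# Illustrative sketch: Gaussian-process regression for a noisy nonlinear relationship.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 6, size=(40, 1)), axis=0)
y = np.sin(X).ravel() ** 3 + 0.1 * rng.standard_normal(40)   # nonlinear signal + noise

kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

X_test = np.linspace(0, 6, 200).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)               # predictive mean and uncertainty
```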
... However, implementing the cross-validation requires additional samples for performance validation and is generally time-consuming, which limits the practicability of BCI systems, to some extent. Bayesian inference provides an elegant way to circumvent this issue by exploiting a properly designed prior distribution [56], [57]. As a typical method, sparse Bayesian learning-based algorithms [18], [58], [59], [60] have been developed for automatic optimization of model hyperparameters, by exploiting a sparsity-induced prior, such as automatic relevance determination or Laplace distribution. ...
Preprint
Accurate electroencephalogram (EEG) pattern decoding for specific mental tasks is one of the key steps for the development of brain-computer interface (BCI), which is quite challenging due to the considerably low signal-to-noise ratio of EEG collected at the brain scalp. Machine learning provides a promising technique to optimize EEG patterns toward better decoding accuracy. However, existing algorithms do not effectively explore the underlying data structure capturing the true EEG sample distribution, and hence can only yield a suboptimal decoding accuracy. To uncover the intrinsic distribution structure of EEG data, we propose a clustering-based multi-task feature learning algorithm for improved EEG pattern decoding. Specifically, we perform affinity propagation-based clustering to explore the subclasses (i.e., clusters) in each of the original classes, and then assign each subclass a unique label based on a one-versus-all encoding strategy. With the encoded label matrix, we devise a novel multi-task learning algorithm by exploiting the subclass relationship to jointly optimize the EEG pattern features from the uncovered subclasses. We then train a linear support vector machine with the optimized features for EEG pattern decoding. Extensive experimental studies are conducted on three EEG datasets to validate the effectiveness of our algorithm in comparison with other state-of-the-art approaches. The improved experimental results demonstrate the outstanding superiority of our algorithm, suggesting its prominent performance for EEG pattern decoding in BCI applications.
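A simplified sketch of the pipeline described in this abstract (affinity-propagation clustering within each class, subclass relabelling, then a linear SVM) is given below; the multi-task feature-learning step is only stubbed out, and all data and parameters are illustrative assumptions.

```python
# Illustrative sketch: cluster each class into subclasses, relabel, then train a linear SVM.
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))                  # e.g., 200 EEG trials, 16 features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)       # binary task labels

# step 1: discover subclasses (clusters) within each class, assuming clustering converges
subclass_labels = np.empty(len(y), dtype=int)
offset = 0
for c in np.unique(y):
    idx = np.where(y == c)[0]
    clusters = AffinityPropagation(random_state=0).fit_predict(X[idx])
    subclass_labels[idx] = clusters + offset         # one-versus-all style relabelling
    offset += clusters.max() + 1

# step 2 (placeholder): the cited work learns multi-task features from the subclass
# structure; here we simply train the final linear SVM on the original task labels.
clf = make_pipeline(StandardScaler(), LinearSVC(max_iter=5000)).fit(X, y)
```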
... With the recent rapid advancement in machine learning (ML)- and deep learning (DL)-driven data science technologies, emerging new informatics tools are effective and efficient for MEG data processing and mining. For example, Bayesian inference has been used in MEG signal processing and brain activity prediction [13], and supervised learning techniques have been incorporated into downstream data mining for MEG in neuropsychiatric and neurodegenerative disorders, such as Huntington's disease, mild traumatic brain injury and bipolar disorder, yielding promising results [14-16]. We have shown the utility of an ML-based data mining pipeline in PTSD classification using MEG connectome data [17]. ...
Article
Objective The present study explores the effectiveness of incorporating temporal information in predicting Post-Traumatic Stress Disorder (PTSD) severity using magnetoencephalography (MEG) imaging data. The main objective was to assess the relationship between longitudinal MEG functional connectome data, measured across a variety of neural oscillatory frequencies and collected at two timepoints (Phase I & II), and PTSD severity captured at the later timepoint. Approach We used an in-house developed informatics solution featuring a two-step process: pre-learn feature selection (CV-SVR-rRF-FS, cross-validation with support vector regression and recursive random forest feature selection) followed by deep learning (a long short-term memory recurrent neural network, LSTM-RNN). Main results The pre-learn step selected a small number of functional connections (or edges) from Phase I MEG data associated with Phase II PTSD severity, indexed using the PTSD CheckList (PCL) score. This strategy identified the functional edges affected by traumatic exposure that index disease severity, either permanently or evolving dynamically over time, for optimal predictive performance. Using the selected functional edges, LSTM modelling was used to incorporate the Phase II MEG data into longitudinal regression models. Single-timepoint (Phase I and Phase II MEG data) SVR models were generated for comparison. Assessed with holdout test data, the alpha and high gamma bands showed enhanced predictive performance with the longitudinal models compared to the Phase I single-timepoint models. The best predictive performance was observed for lower frequency ranges compared to the higher frequencies (low gamma), for both model types. Significance This study identified the neural oscillatory signatures that benefited from additional temporal information when estimating the outcome of PTSD severity using MEG functional connectome data. Crucially, this approach can similarly be applied to any other mental health challenge, using this effective informatics foundation for longitudinal tracking of pathological brain states and predicting outcomes with a MEG-based neurophysiology imaging system.
... Along this line, the Lp norm iterative sparse solution (LPISS) [11] is an iterative sparse learning algorithm based on an Lp norm. Sparse Bayesian learning approaches (SBL) [12]-[14] cast the inverse problem under an empirical Bayesian framework where hyperparameters can be automatically determined and sparse solutions can be obtained. In particular, Champagne [15], [16] is an SBL approach that estimates the number, location, and time course of the sources in a principled fashion. ...
Preprint
Full-text available
Electromagnetic source imaging (ESI) is a highly ill-posed inverse problem. To find a unique solution, traditional ESI methods impose a variety of priors that may not reflect the actual source properties. Such limitations of traditional ESI methods hinder their further applications. Inspired by deep learning approaches, a novel data-synthesized spatio-temporal denoising autoencoder (DST-DAE) method was proposed to solve the ESI inverse problem. Unlike the traditional methods, we utilize a neural network to directly seek a generalized mapping from the measured E/MEG signals to the cortical sources. A novel data synthesis strategy is employed by introducing the prior information of sources into the generated large-scale samples using the forward model of ESI. All the generated data are used to drive the neural network to automatically learn the inverse mapping. To achieve better estimation performance, a denoising autoencoder (DAE) architecture with spatio-temporal feature extraction blocks is designed. Compared with the traditional methods, we show (1) that the novel deep learning approach provides an effective and easy-to-apply way to solve the ESI problem, (2) that DST-DAE with the data synthesis strategy can better consider the characteristics of real sources than the mathematical formulation of prior assumptions, and (3) that the specifically designed architecture of the DAE can not only provide a better estimation of source signals but also be robust to noise pollution. Extensive numerical experiments show that the proposed method is superior to the traditional knowledge-driven ESI methods.
... The rapid advancement of ML-based data mining approaches has shown promise in neuroradiology for the assessment of large, multidimensional data sets. For example, various Bayesian inference-based ML algorithms have been developed for neuroimaging signal processing, and supervised learning methods have been used as informatics tools for data mining (Wu et al. 2016). In the context of translational research and clinical applications, these methods are actively explored for diagnosis, prognostication, and intervention efficacy. ...
Article
Objective: Mild traumatic brain injury (mTBI) is impossible to detect using standard neuroradiological assessment such as structural magnetic resonance imaging (MRI). Injury does however disrupt the dynamic repertoire of neural activity indexed by neural oscillations. In particular, beta oscillations are reliable predictors of cognitive, perceptual and motor system functioning, as well as correlate highly with underlying myelin architecture and brain connectivity - all factors particularly susceptible to dysregulation after mTBI. Methods: We measured local and large-scale neural circuit function using MEG (magnetoencephalography) with a data-driven model fit approach using the Fitting Oscillations & One-Over F algorithm, in a group of young adult males with mTBI and a matched healthy control group. We quantified band-limited regional power and functional connectivity between brain regions. Results: We found reduced regional power and deficits in functional connectivity across brain areas, which pointed to the well-characterized thalamocortical dysconnectivity associated with mTBI. Furthermore, our results suggested beta functional connectivity data reached the best mTBI classification performance when compared with regional power and symptom severity (measured using SCAT2, or Sport Concussion Assessment Tool 2). Conclusions: The current study revealed the relevance of beta oscillations as a window into neurophysiological dysfunction in mTBI, and also highlights the reliability of neural synchrony biomarkers in disorder classification.
... The most common neuroimaging modality employed in BCIs is electroencephalography, a typically non-invasive neuroimaging technology that measures the brain's electrical activity using electrodes placed on the human scalp. The resulting recording, called the electroencephalogram (EEG), is not easy to interpret, as it has a low signal-to-noise ratio and its statistical properties change substantially over time [1]. Moreover, EEG is known to vary significantly across individuals and even to depend on the subject's state during the recording. ...
Article
Full-text available
Electroencephalography signals inherently deviate from the notion of regular spatial sampling, as they reflect the coordinated action of multiple distributed, overlapping cortical networks. Hence, the observed brain dynamics are influenced both by the topology of the sensor array and by the underlying functional connectivity. Neural engineers are currently exploiting advances in the domain of graph signal processing in an attempt to create robust and reliable brain decoding systems. In this direction, Geometric Deep Learning is a highly promising concept for combining the benefits of graph signal processing and deep learning towards revolutionizing Brain-Computer Interfaces (BCIs). However, its exploitation has been hindered by its data-demanding character. As a remedy, we propose here a novel data augmentation approach that combines multiplex network modelling of the multichannel signal with a graph variant of the classical Empirical Mode Decomposition (EMD), and which proves to be a strong asset when combined with Graph Convolutional Neural Networks (GCNNs). As our graph-EMD algorithm makes no assumptions with respect to linearity and stationarity, it appears an appealing solution for analyzing brain signals without artificially imposing regularities in either the temporal or the spatial domain. Our experimental results indicate that the proposed data augmentation scheme leads to substantial improvement when combined with GCNNs. Using recordings from two distinct BCI applications and comparing against a state-of-the-art augmentation method, we illustrate the benefits of its use. By making it available to the BCI community, we hope to further foster the application of geometric deep learning in the field.
Article
Full-text available
Breast cancer, with an expected 42,780 deaths in the US alone in 2024, is one of the most prevalent types of cancer, and the global death toll is far higher. Early detection of breast cancer is the only way to decrease the mortality it causes. In order to diagnose breast cancer, even the most competent and qualified pathologists and radiologists have to examine hundreds of high-resolution images, which places a massive burden on them. Compared to the number of cases, very few experts are available to manage this burden. Additionally, as humans are prone to mistakes, the likelihood of false positive cases is also high. Numerous AI techniques, including machine learning and deep learning, are ideally suited to address these issues, inspiring many researchers to introduce novel computer-aided detection systems. In this study, we comprehensively review the pre-existing literature aimed at developing computer-aided systems based on machine learning, deep learning, and vision transformers to identify and classify breast cancer. We discuss numerous imaging modalities for detecting breast cancer, along with the widely used data pre-processing approaches, machine learning and deep learning models, and ensemble learning methods suitable for the task. Popular datasets and their sources are also listed for future reference. Finally, we identify a few gaps and address potential future research directions with the intent of aiding researchers in selecting approaches tailored to case-specific needs.
Article
Full-text available
The main purpose of this paper is to provide information on how to create a convolutional neural network (CNN) for extracting features from EEG signals. Our task was to understand the primary aspects of creating and fine-tuning CNNs for various application scenarios. We considered the characteristics of EEG signals, coupled with an exploration of various signal processing and data preparation techniques. These techniques include noise reduction, filtering, encoding, decoding, and dimension reduction, among others. In addition, we conduct an in-depth analysis of well-known CNN architectures, categorizing them into four distinct groups: standard implementation, recurrent convolutional, decoder architecture, and combined architecture. This paper further offers a comprehensive evaluation of these architectures, covering accuracy metrics, hyperparameters, and an appendix that contains a table outlining the parameters of commonly used CNN architectures for feature extraction from EEG signals.
Article
Full-text available
Decoding brain activity from non-invasive electroencephalography (EEG) is crucial for brain-computer interfaces (BCIs) and the study of brain disorders. Notably, end-to-end EEG decoding has gained widespread popularity in recent years owing to the remarkable advances in deep learning research. However, many EEG studies suffer from limited sample sizes, making it difficult for existing deep learning models to effectively generalize to highly noisy EEG data. To address this fundamental limitation, this paper proposes a novel end-to-end EEG decoding algorithm that utilizes a low-rank weight matrix to encode both spatio-temporal filters and the classifier, all optimized under a principled sparse Bayesian learning (SBL) framework. Importantly, this SBL framework also enables us to learn hyperparameters that optimally penalize the model in a Bayesian fashion. The proposed decoding algorithm is systematically benchmarked on five motor imagery BCI EEG datasets (N = 192) and an emotion recognition EEG dataset (N = 45), in comparison with several contemporary algorithms, including end-to-end deep-learning-based EEG decoding algorithms. The classification results demonstrate that our algorithm significantly outperforms the competing algorithms while yielding neurophysiologically meaningful spatio-temporal patterns. Our algorithm therefore advances the state-of-the-art by providing a novel EEG-tailored machine learning tool for decoding brain activity. Code is available at https://github.com/EEGdecoding/Code-SBLEST .
Article
The introduction of the Internet of Things has led to the connectivity of millions of devices with less human interaction. This demand for connectivity has resulted in a surge in network attacks, as IoT is susceptible to several cyberattacks. Due to their resource-constrained nature, traditional security mechanisms are inappropriate for securing IoT systems. Hence, there is a need for pervasive security mechanisms that are robust enough to mitigate attacks and secure IoT networks. One of the emerging potential solutions to network security is Machine Learning (ML). Recently, ML has been applied to mitigate cybersecurity threats in Cyber-Physical Systems (CPS). This paper presents a hybrid ML model for the efficient and effective detection of anomalies in IoT systems. The proposed model combines the Random Forest algorithm, XGB, KNN and two decision trees with equal weights assigned to enhance the detection of anomalies in IoT systems. Experimental results show that the proposed hybrid model achieves a higher misbehaviour detection rate than the other ML models in terms of accuracy, precision, recall and F1-score.
Article
Full-text available
Accurate reconstruction of brain activities from electroencephalography and magnetoencephalography (E/MEG) remains a long-standing challenge due to the intrinsic ill-posedness of the inverse problem. In this study, to address this issue, we propose a novel data-driven source imaging framework based on sparse Bayesian learning and a deep neural network (SI-SBLNN). Within this framework, the variational inference of the conventional algorithm, which is built upon sparse Bayesian learning, is compressed by constructing a straightforward mapping from measurements to latent sparseness-encoding parameters using a deep neural network. The network is trained with synthesized data derived from the probabilistic graphical model embedded in the conventional algorithm. We realized this framework using the algorithm source imaging based on spatio-temporal basis functions (SI-STBF) as the backbone. In numerical simulations, the proposed algorithm demonstrated its applicability to different head models and its robustness against different noise intensities. It also achieved superior performance compared to SI-STBF and several benchmarks in a variety of source configurations. Additionally, in real data experiments, it obtained results concordant with prior studies.
Chapter
Full-text available
With the emergence of the COVID-19 pandemic, e-learning was the only way to avoid the interruption of studies in educational institutions and universities, so this field has received significant attention in recent times. In this paper, we used ten Machine Learning (ML) algorithms: Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), SGD Classifier, Multinomial NB, K-Nearest Neighbors (KNN), Ridge Classifier, Nearest Centroid, Complement NB, and Bernoulli NB. These were used to build a prediction system based on artificial intelligence techniques to predict the difficulties students face in using the e-learning management system and to support related decision-making, which in turn contributes to supporting the sustainable development of technology at the university. From the results obtained, we detected the important factors that affect the use of e-learning to address students' learning difficulties with the LMS by building a prediction system based on AI techniques. Keywords: machine learning, e-learning, student performance, educational data mining.
Article
We consider the reconstruction of brain activity from electroencephalography (EEG). This inverse problem can be formulated as a linear regression with independent Gaussian scale mixture priors for both the source and noise components. Crucial factors influencing the accuracy of the source estimation are not only the noise level but also its correlation structure, but existing approaches have not addressed the estimation of noise covariance matrices with full structure. To address this shortcoming, we develop hierarchical Bayesian (type-II maximum likelihood) models for observations with latent variables for source and noise, which are estimated jointly from data. As an extension of classical sparse Bayesian learning (SBL), where across-sensor observations are assumed to be independent and identically distributed, we consider Gaussian noise with full covariance structure. Using the majorization-maximization framework and Riemannian geometry, we derive an efficient algorithm for updating the noise covariance along the manifold of positive definite matrices. We demonstrate that our algorithm has guaranteed and fast convergence and validate it in simulations and with real MEG data. Our results demonstrate that the novel framework significantly improves upon state-of-the-art techniques in the real-world scenario where the noise is indeed non-diagonal and full-structured. Our method has applications in many domains beyond biomagnetic inverse problems.
Article
Simultaneously estimating brain source activity and noise has long been a challenging task in electromagnetic brain imaging using magneto- and electroencephalography. The problem is challenging not only in terms of solving the NP-hard inverse problem of reconstructing unknown brain activity across thousands of voxels from a limited number of sensors, but also because of the need to simultaneously estimate the noise and interference. We present a generative model with an augmented leadfield matrix to simultaneously estimate brain source activity and sensor noise statistics in electromagnetic brain imaging (EBI). We then derive three Bayesian inference algorithms for this generative model (expectation-maximization (EBI-EM), convex bounding (EBI-Convex) and fixed-point (EBI-Mackay)) to simultaneously estimate the hyperparameters of the prior distribution for brain source activity and sensor noise. A comprehensive performance evaluation of these three algorithms is performed. Simulations consistently show that the performance of the EBI-Convex and EBI-Mackay updates is superior to that of EBI-EM. In contrast to the EBI-EM algorithm, both EBI-Convex and EBI-Mackay updates are quite robust to initialization, and are computationally efficient with fast convergence in the presence of both Gaussian and real brain noise. We also demonstrate that the EBI-Convex and EBI-Mackay update algorithms can reconstruct complex brain activity from only a few trials of sensor data as well as from resting-state data, achieving a significant improvement in source reconstruction and noise learning for electromagnetic brain imaging.
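For orientation only, the sketch below applies a textbook MacKay-style fixed-point update of the source-variance hyperparameters in a simplified Gaussian model with known, isotropic sensor noise; it is not the EBI augmented-leadfield algorithm, and the leadfield and source configuration are invented for the example.

```python
# Simplified sparse Bayesian learning loop: y = L x + n, x ~ N(0, diag(gamma)),
# n ~ N(0, sigma2 I); gamma is updated with the classical MacKay fixed point.
import numpy as np

rng = np.random.default_rng(1)
n_sensors, n_sources = 20, 100

L = rng.standard_normal((n_sensors, n_sources))            # hypothetical leadfield
x_true = np.zeros(n_sources); x_true[[10, 40, 70]] = [2.0, -1.5, 1.0]
sigma2 = 0.05                                               # assumed known noise variance
y = L @ x_true + np.sqrt(sigma2) * rng.standard_normal(n_sensors)

gamma = np.ones(n_sources)                                  # source variance hyperparameters
for _ in range(100):
    # Gaussian posterior of x given the current hyperparameters.
    A = L.T @ L / sigma2 + np.diag(1.0 / gamma)
    Sigma = np.linalg.inv(A)
    mu = Sigma @ L.T @ y / sigma2
    # MacKay fixed-point update of each source variance.
    denom = np.maximum(1.0 - np.diag(Sigma) / gamma, 1e-12)
    gamma = np.maximum(mu**2 / denom, 1e-12)

print("largest learned variances at sources:", np.argsort(-gamma)[:3])
```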
Article
Communication between the human brain and external devices can be established using an Electroencephalogram (EEG)-based Brain-Computer Interface by converting the neural activity of the brain into electrical signals. The EEG signals were decomposed into an energy–frequency–time spectrum with the Hilbert-Huang transform, which was used by a Deep Learning (DL)-based model to learn discriminative spectro-temporal patterns of the raw EEG signals of ten digits. This paper has two major contributions: first, to create a novel dataset, known as BrainDigiData, of EEG signals of the ten digits (0-9) recorded using a multi-channel EEG device; second, to propose a DL-based one-dimensional convolutional neural network model, BrainDigiCNN, to classify the BrainDigiData EEG signals of digits. The publicly available Mind Big Dataset (MBD) of digits was also used to evaluate the performance of the proposed model. The research in this paper showed that band-wise analysis of EEG signals in a complex scenario yielded improved results compared to the scenario used in previous work on digit classification from EEG signals. The proposed BrainDigiCNN model achieved the highest average accuracy of 96.99%. An average classification accuracy of 98.27% was achieved on the MBD dataset from the 14-channel EMOTIV EPOC+ device and 89.62% on the MBD dataset from the 5-channel EMOTIV Insight. Statistical analysis of the proposed model against traditional Machine Learning (ML) classifiers using a paired t-test resulted in a p-value of less than 0.05, which shows a significant difference between the proposed model and the ML classifiers.
Chapter
Dementia is a group of symptoms caused by neurodegenerative disease. It is characterized by impairment in memory, reasoning, behavior, and the ability to perform everyday activities. Worldwide, 50 million people have dementia, and nearly 10 million new cases occur each year. Dementia is a significant cause of disability and dependency in late life. Dementia has a physical, psychological, social, and economic impact on patients as well as their carers, families, and society. Therefore, there is a need for automated early dementia diagnosis that has cognitive as well as electroencephalogram (EEG) components. State-of-the-art methods have been proposed for efficient dementia diagnosis using machine learning (ML) and deep learning (DL) algorithms with imaging data. Imaging-based diagnosis usually misses the early signs of neurodegenerative disease; however, these signs are clearly visible in a psychophysiological experiment. Datasets for dementia diagnosis using cognitive tasks are limited, but some recent research has shown significant results using different cognitive tests. Many other EEG-based ML techniques have achieved good accuracy in early dementia diagnosis, but there is still no final solution. This chapter summarizes the work done to date on dementia diagnosis based on EEG and cognitive task data and compares the various ML approaches used in this regard. It also summarizes different ML approaches with advanced EEG signal processing that can guide future researchers, practitioners, and technicians.
Article
Full-text available
Accurate reconstruction of cortical activation from electroencephalography and magnetoencephalography (E/MEG) is a long-standing challenge because of the inherently ill-posed inverse problem. In this paper, a novel algorithm under the empirical Bayesian framework, source imaging with smoothness in spatial and temporal domains (SI-SST), is proposed to address this issue. In SI-SST, current sources are decomposed into the product of a spatial smoothing kernel, sparseness-encoding coefficients, and temporal basis functions (TBFs). Further smoothness is integrated in the temporal domain through an underlying autoregressive model. Because the sparseness-encoding coefficients in this model are constructed from overlapping clusters over the cortex, we derived a novel update rule based on a fixed-point criterion instead of the convexity-based approach, which becomes invalid in this scenario. All variables and hyperparameters are updated alternately in the variational inference procedure. SI-SST was assessed by multiple metrics with both simulated and experimental datasets. In practice, SI-SST had superior reconstruction performance in both spatial extent and temporal profile compared to the benchmarks.
Article
The electroencephalogram (EEG) signal, as a data carrier that can contain a large amount of information about the human brain in different states, is one of the most widely used metrics for assessing human psychophysiological states. Among a variety of analysis methods, deep learning, especially the convolutional neural network (CNN), has achieved remarkable results in recent years as a method to effectively extract features from EEG signals. Although deep learning has the advantages of automatic feature extraction and effective classification, it also faces difficulties in network structure design and requires a large amount of prior knowledge. Automating the design of these hyperparameters can therefore save experts' time and manpower, and neural architecture search (NAS) techniques have thus emerged. In this paper, we build on an existing gradient-based NAS algorithm, PC-DARTS, with targeted improvements and optimizations for the characteristics of EEG signals. Specifically, we establish the model architecture step by step based on manually designed deep learning models for EEG discrimination, retaining the framework of the search algorithm and performing targeted optimization of the model search space. Corresponding features are extracted separately according to the frequency-domain and time-domain characteristics of the EEG signal and the spatial positions of the EEG electrodes. The architecture was applied to EEG-based emotion recognition and driver drowsiness assessment tasks. The results illustrate that, compared with existing methods, the model architecture obtained in this paper achieves competitive overall accuracy and a better standard deviation in both tasks. This approach is therefore an effective migration of NAS technology into the field of EEG analysis and has great potential to provide high-performance results for other types of classification and prediction tasks, effectively reducing the time cost for researchers and facilitating the application of CNNs in more areas.
Article
Pain is a dynamic, complex and multidimensional experience. The identification of pain from brain activity as a neural readout may effectively provide a neural code for pain, and further provide useful information for pain diagnosis and treatment. Advances in neuroimaging and large-scale electrophysiology have enabled us to examine neural activity with improved spatial and temporal resolution, providing opportunities to decode pain in humans and freely behaving animals. This topical review provides a systematic overview of state-of-the-art methods for decoding pain from brain signals, with special emphasis on electrophysiological and neuroimaging modalities. We show how pain decoding analyses can help pain diagnosis and the discovery of neurobiomarkers for chronic pain. Finally, we discuss the challenges in the research field and point to several important future research directions.
Article
A rapidly aging population worldwide has spurred interest in developing new strategies to cope with neural declines and neurodegenerative disorders. Noninvasive brain stimulation (NIBS) is increasingly being used to explore functional mechanisms of the brain and induce the therapeutic modulation of behavior, cognition, and emotion. Galvanic vestibular stimulation (GVS), a safe and well-tolerated NIBS technique, is capable of modulating activity in various cortical and subcortical areas involved in vestibular and multisensory processing. A key facet of GVS is that the resultant effects may, in part, be a function of the individual being treated and the stimulus waveform that is delivered. Yet, most GVS studies have utilized the same generic stimulus, chosen from a reduced repertoire of candidates, across all subjects. The future use and, ultimately, clinical adoption of this technology will rely on contributions from the signal processing community to customize stimuli that are optimized for their effect and to exert maximum influence on brain imaging biomarkers. We provide a signal processing-focused overview of the current GVS state of the art in neurorehabilitation, including general stimulation design, concurrent analysis with neuroimaging data, and suggestions for future directions.
Article
Multimodal functional neuroimaging by integrating functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) has the promise of recovering brain activities with high spatiotemporal resolution, which is crucial for neuroscience research and clinical diagnosis. However, the misalignment of the localizations between fMRI and EEG activities may degrade the accuracy of the fMRI-constrained EEG source imaging (ESI) technique. To leverage the complementary spatiotemporal resolution of fMRI and EEG in a data-driven fashion, we propose an asymmetric approach for EEG/fMRI fusion, termed fMRI-informed source imaging based on spatiotemporal basis functions (fMRI-SI-STBF). fMRI-SI-STBF employs the covariance components (CCs) derived from clusters defined by fMRI and EEG signals as spatial priors within the empirical Bayesian framework. Additionally, fMRI-SI-STBF represents the current source matrix as a linear combination of several unknown temporal basis functions (TBFs) by matrix decomposition. The relative contribution of each of the fMRI-informed and EEG-informed CCs, as well as the number and profiles of the TBFs, are all automatically determined based on the EEG data using variational Bayesian inference. Our results demonstrate that fMRI-SI-STBF can effectively utilize valid fMRI information for ESI and is robust to invalid fMRI priors. This robustness is essential for practical ESI since the validity of fMRI priors is often unclear considering that fMRI is an indirect measure of the neural activity. Moreover, fMRI-SI-STBF can achieve performance improvement by incorporating temporal constraints compared to methods that use spatial constraints only. Under all the simulated conditions, fMRI-SI-STBF reconstructs the source extents, locations and time courses more accurately than the EEG-fMRI ESI methods (i.e., fwMNE, fMRI-SI-SBF) and ESI methods without fMRI priors (i.e., wMNE, LORETA, SBL, SI-STBF, SI-SBF), indicated by the smaller spatial dispersion (average SD < 5 mm), distance of localization error (average DLE < 2 mm), shape error (average SE < 0.9) and larger model evidence values.
Article
Full-text available
Brain signals can be acquired and analyzed using a variety of techniques, as described in this literature review. Understanding the possibilities offered by analytical techniques broadens researchers' perspectives for developing automated approaches to detecting biological events. In particular, EEG signals can be analyzed with a variety of strategies, suggesting that a combination of techniques may be best for straightforward automated analysis and detection of epileptic seizures.
Article
Accurate electroencephalogram (EEG) pattern decoding for specific mental tasks is one of the key steps in the development of brain-computer interfaces (BCI), and is quite challenging due to the considerably low signal-to-noise ratio of EEG collected at the scalp. Machine learning provides a promising technique to optimize EEG patterns toward better decoding accuracy. However, existing algorithms do not effectively explore the underlying data structure capturing the true EEG sample distribution and, hence, can only yield suboptimal decoding accuracy. To uncover the intrinsic distribution structure of EEG data, we propose a clustering-based multitask feature learning algorithm for improved EEG pattern decoding. Specifically, we perform affinity propagation-based clustering to explore the subclasses (i.e., clusters) in each of the original classes and then assign each subclass a unique label based on a one-versus-all encoding strategy. With the encoded label matrix, we devise a novel multitask learning algorithm that exploits the subclass relationship to jointly optimize the EEG pattern features from the uncovered subclasses. We then train a linear support vector machine with the optimized features for EEG pattern decoding. Extensive experimental studies on three EEG datasets validate the effectiveness of our algorithm in comparison with other state-of-the-art approaches, and the improved results demonstrate its clear superiority, suggesting strong potential for EEG pattern decoding in BCI applications.
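A much-simplified version of the subclass idea can be sketched as follows: cluster each class with affinity propagation, relabel trials by cluster in a one-versus-all fashion, and train a linear SVM on the relabelled data. The paper's joint multitask feature optimization is omitted, and the synthetic feature vectors only stand in for real EEG features.

```python
# Subclass discovery via affinity propagation followed by a linear SVM.
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)
# Hypothetical EEG feature vectors (e.g., CSP log-variances) for two classes.
X = np.vstack([rng.normal(loc=m, size=(60, 8)) for m in (-1.0, 1.0)])
y = np.repeat([0, 1], 60)

sub_labels = np.empty_like(y)
offset = 0
for c in np.unique(y):
    idx = np.where(y == c)[0]
    clusters = AffinityPropagation(random_state=0).fit_predict(X[idx])
    sub_labels[idx] = clusters + offset        # unique label per discovered subclass
    offset += clusters.max() + 1

clf = LinearSVC().fit(X, sub_labels)           # one-versus-rest over subclasses
# Map predicted subclasses back to the original classes for evaluation.
sub_to_class = {s: y[sub_labels == s][0] for s in np.unique(sub_labels)}
pred = np.array([sub_to_class[s] for s in clf.predict(X)])
print("training accuracy:", (pred == y).mean())
```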
Article
Effectively extracting common spatial pattern (CSP) features from motor imagery (MI) EEG signals often depends heavily on the filter band selection. At the same time, optimizing the EEG channel combination is another key issue that substantially affects the SMR feature representations. Although numerous algorithms have been developed to find channels that record important characteristics of MI, most of them select channels in a cumbersome way with low computational efficiency, thereby limiting the practicality of MI-based BCI systems. In this study, we propose the multi-scale optimization (MSO) of spatial patterns, optimizing filter bands over multiple channel sets within CSPs to further improve the performance of MI-based BCI. Specifically, several channel subsets are first heuristically predefined, and the raw EEG data specific to each of these subsets are then bandpass-filtered at the overlap between a set of filter bands. Further, instead of solving a learning problem for each channel subset independently, we propose a multi-view learning based sparse optimization to jointly extract robust CSP features with L2,1-norm regularization, aiming to capture the shared salient information across multiple related spatial patterns for enhanced classification performance. A support vector machine (SVM) classifier is then trained on these optimized EEG features for accurate recognition of MI tasks. Experimental results on three public EEG datasets validate the effectiveness of MSO compared to several competing methods and their variants. These results demonstrate that the proposed MSO method has promising potential in MI-based BCIs.
Article
Full-text available
In this paper, we develop a robust sliding-mode nonlinear predictive controller for brain-controlled robots with enhanced performance, safety, and robustness. First, the kinematics and dynamics of a mobile robot are built. After that, the proposed controller is developed by cascading a predictive controller and a smooth sliding mode controller. The predictive controller integrates the human intention tracking with safety guarantee objectives into an optimization problem to minimize the invasion to human intention while maintaining the robot safety. The smooth sliding mode controller is designed to achieve robust desired velocity tracking. The results of human-in-the-loop simulation and robotic experiments both show the efficacy and robust performance of the proposed controller. This work provides an enabling design to enhance the future research and development of brain-controlled robots.
Article
Full-text available
Common spatial patterns (CSP) is a well-known spatial filtering algorithm for multichannel electroencephalogram (EEG) analysis. In this paper, we cast the CSP algorithm in a probabilistic modeling setting. Specifically, probabilistic CSP (P-CSP) is proposed as a generic EEG spatio-temporal modeling framework that subsumes the CSP and regularized CSP algorithms. The proposed framework enables us to resolve the overfitting issue of CSP in a principled manner. We derive statistical inference algorithms that can alleviate the issue of local optima. In particular, an efficient algorithm based on eigendecomposition is developed for maximum a posteriori (MAP) estimation in the case of isotropic noise. For more general cases, a variational algorithm is developed for group-wise sparse Bayesian learning for the P-CSP model and for automatically determining the model size. The two proposed algorithms are validated on a simulated data set. Their practical efficacy is also demonstrated by successful applications to single-trial classifications of three motor imagery EEG data sets and by the spatio-temporal pattern analysis of one EEG data set recorded in a Stroop color naming task.
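The probabilistic model itself is not reproduced here, but the classical CSP computation that P-CSP generalizes can be sketched as a generalized eigendecomposition of class covariance matrices; the synthetic trials, mixing matrix, and number of retained filters below are illustrative assumptions.

```python
# Textbook CSP from two-class trial covariances via a generalized eigenproblem.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(3)
n_trials, n_channels, n_times = 40, 16, 200

def trial_cov(trials):
    covs = [t @ t.T / np.trace(t @ t.T) for t in trials]   # trace-normalized covariances
    return np.mean(covs, axis=0)

# Hypothetical band-pass filtered trials for two motor-imagery classes.
A = rng.standard_normal((n_channels, n_channels))          # mixing for class 2 only
X1 = rng.standard_normal((n_trials, n_channels, n_times))
X2 = np.einsum('ij,tjk->tik', A, rng.standard_normal((n_trials, n_channels, n_times)))

C1, C2 = trial_cov(X1), trial_cov(X2)
# Generalized eigenvalue problem C1 w = lambda (C1 + C2) w.
evals, W = eigh(C1, C1 + C2)
# Filters with the most extreme eigenvalues discriminate the two classes best.
filters = np.hstack([W[:, :3], W[:, -3:]])
features = np.log(np.var(np.einsum('cf,tck->tfk', filters, X1), axis=2))
print("CSP feature matrix shape:", features.shape)          # (n_trials, 6)
```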
Article
Full-text available
The MEG/EEG inverse problem is ill-posed, giving different source reconstructions depending on the initial assumption sets. Parametric Empirical Bayes allows one to implement most popular MEG/EEG inversion schemes (minimum norm, LORETA, etc.) within the same generic Bayesian framework. It also provides a cost function in terms of the variational free energy, an approximation to the marginal likelihood or evidence of the solution. In this manuscript, we revisit the algorithm for MEG/EEG source reconstruction with a view to providing a didactic and practical guide. The aim is to promote and help standardize the development and consolidation of other schemes within the same framework. We describe the implementation in the Statistical Parametric Mapping (SPM) software package, carefully explaining each of its stages with the help of a simple simulated data example. We focus on the Multiple Sparse Priors (MSP) model, which we compare with the well-known Minimum Norm and LORETA models, using the negative variational free energy for model comparison. The manuscript is accompanied by Matlab scripts to allow the reader to test and explore the underlying algorithm.
Article
Full-text available
Modelling is fundamental to many fields of science and engineering. A model can be thought of as a representation of possible data one could predict from a system. The probabilistic approach to modelling uses probability theory to express all aspects of uncertainty in the model. The probabilistic approach is synonymous with Bayesian modelling, which simply uses the rules of probability theory in order to make predictions, compare alternative models, and learn model parameters and structure from data. This simple and elegant framework is most powerful when coupled with flexible probabilistic models. Flexibility is achieved through the use of Bayesian non-parametrics. This article provides an overview of probabilistic modelling and an accessible survey of some of the main tools in Bayesian non-parametrics. The survey covers the use of Bayesian non-parametrics for modelling unknown functions, density estimation, clustering, time-series modelling, and representing sparsity, hierarchies, and covariance structure. More specifically, it gives brief non-technical overviews of Gaussian processes, Dirichlet processes, infinite hidden Markov models, Indian buffet processes, Kingman's coalescent, Dirichlet diffusion trees and Wishart processes.
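As a small, self-contained illustration of one of the non-parametric building blocks surveyed here (the Dirichlet process), the sketch below draws a random discrete distribution via truncated stick-breaking; the concentration parameter, truncation level, and Gaussian base measure are arbitrary choices made for the example.

```python
# Truncated stick-breaking construction of a Dirichlet process random measure.
import numpy as np

rng = np.random.default_rng(4)
alpha, truncation = 2.0, 50                     # concentration and truncation level

betas = rng.beta(1.0, alpha, size=truncation)   # stick-breaking proportions
remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
weights = betas * remaining                     # mixture weights (sum close to 1)
atoms = rng.standard_normal(truncation)         # atoms drawn from a N(0, 1) base measure

# Sample observations from the induced discrete random measure.
draws = rng.choice(atoms, size=10, p=weights / weights.sum())
print("number of components with weight > 0.01:", int((weights > 0.01).sum()))
```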
Article
Full-text available
Telemonitoring of electroencephalogram (EEG) through wireless body-area networks is an evolving direction in personalized medicine. Among the various constraints in designing such a system, three important ones are energy consumption, data compression, and device cost. Conventional data compression methodologies, although effective in data compression, consume significant energy and cannot reduce device cost. Compressed sensing (CS), as an emerging data compression methodology, is promising for addressing these constraints. However, EEG is non-sparse in the time domain and also non-sparse in transformed domains (such as the wavelet domain). Therefore, it is extremely difficult for current CS algorithms to recover EEG with a quality that satisfies the requirements of clinical diagnosis and engineering applications. Recently, Block Sparse Bayesian Learning (BSBL) was proposed as a new approach to the CS problem. This study introduces the technique to the telemonitoring of EEG. Experimental results show that its recovery quality is better than that of state-of-the-art CS algorithms, and is sufficient for practical use. These results suggest that BSBL is very promising for telemonitoring of EEG and other non-sparse physiological signals.
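The sketch below shows a generic compressed-sensing round trip (random sensing matrix, L1 recovery in a DCT dictionary) purely as a baseline for orientation; it is not the BSBL algorithm from the paper, and the smooth synthetic signal merely stands in for real EEG.

```python
# Generic CS demo: compress with a random matrix, recover with Lasso in a DCT basis.
import numpy as np
from scipy.fft import idct
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
n, m = 256, 96                                  # original and compressed lengths

t = np.arange(n) / n
x = np.sin(2 * np.pi * 7 * t) + 0.5 * np.sin(2 * np.pi * 13 * t)  # toy "signal"

Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # sensing matrix applied on the device
y = Phi @ x                                     # compressed measurements

# Dictionary: the signal is the inverse DCT of its (approximately sparse) coefficients.
Psi = idct(np.eye(n), norm='ortho', axis=0)     # columns are inverse-DCT basis vectors
A = Phi @ Psi

lasso = Lasso(alpha=1e-3, fit_intercept=False, max_iter=50000).fit(A, y)
x_rec = Psi @ lasso.coef_
print("relative reconstruction error:", np.linalg.norm(x_rec - x) / np.linalg.norm(x))
```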
Chapter
Full-text available
A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. When used in conjunction with statistical techniques, the graphical model has several advantages for data analysis. One, because the model encodes dependencies among all variables, it readily handles situations where some data entries are missing. Two, a Bayesian network can be used to learn causal relationships, and hence can be used to gain understanding about a problem domain and to predict the consequences of intervention. Three, because the model has both a causal and probabilistic semantics, it is an ideal representation for combining prior knowledge (which often comes in causal form) and data. Four, Bayesian statistical methods in conjunction with Bayesian networks offer an efficient and principled approach for avoiding the overfitting of data. In this paper, we discuss methods for constructing Bayesian networks from prior knowledge and summarize Bayesian statistical methods for using data to improve these models. With regard to the latter task, we describe methods for learning both the parameters and structure of a Bayesian network, including techniques for learning with incomplete data. In addition, we relate Bayesian-network methods for learning to techniques for supervised and unsupervised learning. We illustrate the graphical-modeling approach using a real-world case study.
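A toy example of the kind of model described, three binary variables with hand-specified conditional probability tables and posterior inference by exhaustive enumeration, is sketched below; all probabilities are invented for illustration.

```python
# Tiny Bayesian network: Cause -> Symptom1, Cause -> Symptom2, inference by enumeration.
p_cause = {1: 0.01, 0: 0.99}                            # P(Cause)
p_s1 = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.1, 0: 0.9}}       # P(Symptom1 | Cause)
p_s2 = {1: {1: 0.7, 0: 0.3}, 0: {1: 0.05, 0: 0.95}}     # P(Symptom2 | Cause)

def joint(c, s1, s2):
    # Joint probability factorizes along the network structure.
    return p_cause[c] * p_s1[c][s1] * p_s2[c][s2]

# Posterior P(Cause = 1 | Symptom1 = 1, Symptom2 = 1) by summing out the cause.
num = joint(1, 1, 1)
den = sum(joint(c, 1, 1) for c in (0, 1))
print("P(cause | both symptoms observed) =", num / den)
```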
Conference Paper
Full-text available
Automatic relevance determination (ARD) and the closely-related sparse Bayesian learning (SBL) framework are effective tools for pruning large numbers of irrelevant features leading to a sparse explanatory subset. However, popular update rules used for ARD are either difficult to extend to more general problems of interest or are characterized by non-ideal convergence properties. Moreover, it remains unclear exactly how ARD relates to more traditional MAP estimation-based methods for learning sparse representations (e.g., the Lasso). This paper furnishes an alternative means of expressing the ARD cost function using auxiliary functions that naturally addresses both of these issues. First, the proposed reformulation of ARD can naturally be optimized by solving a series of re-weighted ℓ1 problems. The result is an efficient, extensible algorithm that can be implemented using standard convex programming toolboxes and is guaranteed to converge to a local minimum (or saddle point). Secondly, the analysis reveals that ARD is exactly equivalent to performing standard MAP estimation in weight space using a particular feature- and noise-dependent, non-factorial weight prior. We then demonstrate that this implicit prior maintains several desirable advantages over conventional priors with respect to feature selection. Overall these results suggest alternative cost functions and update procedures for selecting features and promoting sparse solutions in a variety of general situations. In particular, the methodology readily extends to handle problems such as non-negative sparse coding and covariance component estimation.
Article
Full-text available
Many practical methods for finding maximally sparse coefficient expansions involve solving a regression problem using a particular class of concave penalty functions. From a Bayesian perspective, this process is equivalent to maximum a posteriori (MAP) estimation using a sparsity-inducing prior distribution (Type I estimation). Using variational techniques, this distribution can always be conveniently expressed as a maximization over scaled Gaussian distributions modulated by a set of latent variables. Alternative Bayesian algorithms, which operate in latent variable space leveraging this variational representation, lead to sparse estimators reflecting posterior information beyond the mode (Type II estimation). Currently, it is unclear how the underlying cost functions of Type I and Type II relate, nor what relevant theoretical properties exist, especially with regard to Type II. Herein a common set of auxiliary functions is used to conveniently express both Type I and Type II cost functions in either coefficient or latent variable space facilitating direct comparisons. In coefficient space, the analysis reveals that Type II is exactly equivalent to performing standard MAP estimation using a particular class of dictionary- and noise-dependent, non-factorial coefficient priors. One prior (at least) from this class maintains several desirable advantages over all possible Type I methods and utilizes a novel, non-convex approximation to the ℓ0-norm with most, and in certain quantifiable conditions all, local minima smoothed away. Importantly, the global minimum is always left unaltered unlike standard ℓ1-norm relaxations. This ensures that any appropriate descent method is guaranteed to locate the maximally sparse solution.
Article
Full-text available
The aim of this paper is to describe a simple procedure for electromagnetic (EEG or MEG) source reconstruction in the context of group studies. This entails a simple extension of existing source reconstruction techniques based upon the inversion of hierarchical models. The extension ensures that evoked or induced responses are reconstructed in the same subset of sources over subjects. Effectively, the procedure aligns the deployment of reconstructed activity over subjects and substantially increases the detection of differences between evoked or induced responses at the group or between-subject level.
Article
Full-text available
A Brain-Computer Interface (BCI) is a specific type of human-computer interface that enables direct communication between humans and computers by analyzing brain measurements. Oddball paradigms are used in BCI to generate event-related potentials (ERPs), like the P300 wave, on targets selected by the user. A P300 speller is based on this principle, where the detection of P300 waves allows the user to write characters. The P300 speller involves two classification problems. The first is to detect the presence of a P300 in the electroencephalogram (EEG). The second corresponds to the combination of different P300 responses to determine the right character to spell. A new method for the detection of P300 waves is presented. This model is based on a convolutional neural network (CNN). The topology of the network is adapted to the detection of P300 waves in the time domain. Seven classifiers based on the CNN are proposed: four single classifiers with different feature sets and three multiclassifiers. These models are tested and compared on Data set II of the third BCI competition. The best result is obtained with a multiclassifier solution, with a recognition rate of 95.5 percent, without channel selection before classification. The proposed approach also provides a new way of analyzing brain activity owing to the receptive fields of the CNN models.
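A minimal CNN in the same spirit, one spatial convolution across electrodes followed by a temporal convolution and a binary read-out, can be sketched as follows; the layer sizes and epoch dimensions are illustrative guesses, not the architecture from the paper.

```python
# Small spatial-then-temporal CNN for binary (target vs. non-target) ERP epochs.
import torch
import torch.nn as nn

n_channels, n_times = 64, 160                     # hypothetical epoch size

model = nn.Sequential(
    nn.Conv2d(1, 10, kernel_size=(n_channels, 1)),            # spatial filtering
    nn.ReLU(),
    nn.Conv2d(10, 13, kernel_size=(1, 16), stride=(1, 8)),     # temporal filtering
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(13 * 19, 2),   # 19 temporal positions remain after the strided conv
)

x = torch.randn(8, 1, n_channels, n_times)        # batch of 8 synthetic epochs
logits = model(x)
print(logits.shape)                               # torch.Size([8, 2])
```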
Article
Full-text available
Paperback version of the 2001 second edition. This book covers the so-called Bayesian approach to statistical inference, and in particular its decision-theoretic aspects. The foundations of this axiomatic framework (choice of the prior, optimal decisions, tests, and confidence regions) are covered in detail, along with more recent developments of Bayesian analysis such as model choice, stochastic numerical approximation methods (MCMC), the theory of noninformative priors (Berger-Bernardo axioms), and the relation to the classical theory of admissibility. Each chapter is complemented by an extensive set of exercises of increasing difficulty and by bibliographical notes on the topics covered. The book can be used in a Master's program in applied mathematics, biometrics, econometrics, or any other program relying on quantitative information-processing techniques; it requires only a basic course in probability theory and mathematical statistics as a prerequisite. It can also be used by doctoral students or experienced researchers seeking an effective statistical methodology for analyzing their model(s). Winner of the 2004 DeGroot Prize. This paperback edition, a reprint of the 2001 edition, is a graduate-level textbook that introduces Bayesian statistics and decision theory. It covers both the basic ideas of statistical theory and some of the more modern and advanced topics of Bayesian statistics such as complete class theorems, the Stein effect, Bayesian model choice, hierarchical and empirical Bayes modeling, Monte Carlo integration including Gibbs sampling, and other MCMC techniques. It was awarded the 2004 DeGroot Prize by the International Society for Bayesian Analysis (ISBA) for setting "a new standard for modern textbooks dealing with Bayesian methods, especially those using MCMC techniques," and for being "a worthy successor to DeGroot's and Berger's earlier texts."
Article
Full-text available
We describe an asymmetric approach to fMRI and MEG/EEG fusion in which fMRI data are treated as empirical priors on electromagnetic sources, such that their influence depends on the MEG/EEG data, by virtue of maximizing the model evidence. This is important if the causes of the MEG/EEG signals differ from those of the fMRI signal. Furthermore, each suprathreshold fMRI cluster is treated as a separate prior, which is important if fMRI data reflect neural activity arising at different times within the EEG/MEG data. We present methodological considerations when mapping from a 3D fMRI Statistical Parametric Map to a 2D cortical surface and thence to the covariance components used within our Parametric Empirical Bayesian framework. Our previous introduction of a canonical (inverse-normalized) cortical mesh also allows deployment of fMRI priors that live in a template space; for example, from a group analysis of different individuals. We evaluate the ensuing scheme with MEG and EEG data recorded simultaneously from 12 participants, using the same face-processing paradigm under which independent fMRI data were obtained. Because the fMRI priors become part of the generative model, we use the model evidence to compare (i) multiple versus single, (ii) valid versus invalid, (iii) binary versus continuous, and (iv) variance versus covariance fMRI priors. For these data, multiple, valid, binary, and variance fMRI priors proved best for a standard Minimum Norm inversion. Interestingly, however, inversion using Multiple Sparse Priors benefited little from additional fMRI priors, suggesting that they already provide a sufficiently flexible generative model.
Article
Full-text available
Dynamic Causal Modelling (DCM) is an approach first introduced for the analysis of functional magnetic resonance imaging (fMRI) to quantify effective connectivity between brain areas. Recently, this framework has been extended and established in the magneto-/electroencephalography (M/EEG) domain. DCM for M/EEG entails the inversion of a full spatiotemporal model of evoked responses over multiple conditions. This model rests on a biophysical and neurobiological generative model for electrophysiological data; a generative model is a prescription of how data are generated. The inversion of a DCM provides conditional densities on the model parameters and, indeed, on the model itself. These densities enable one to answer key questions about the underlying system. A DCM comprises two parts: one part describes the dynamics within and among neuronal sources, and the second describes how source dynamics generate data in the sensors, using the lead field. The parameters of this spatiotemporal model are estimated using a single (iterative) Bayesian procedure. In this paper, we motivate and describe the current DCM framework. Two examples show how the approach can be applied to M/EEG experiments.
Article
Full-text available
Objective: Magnetoencephalography (MEG) dipole localization of epileptic spikes is useful in epilepsy surgery for mapping the extent of abnormal cortex and to focus intracranial electrodes. Visually analyzing large amounts of data produces fatigue and error. Most automated techniques are based on matching of interictal spike templates or predictive filtering of the data and do not explicitly include source localization as part of the analysis. This leads to poor sensitivity versus specificity characteristics. We describe a fully automated method that combines time-series analysis with source localization to detect clusters of focal neuronal current generators within the brain that produce interictal spike activity. Methods: We first use an ICA (independent components analysis) method to decompose the multichannel MEG data and identify those components that exhibit spike-like characteristics. From these detected spikes we then find those whose spatial topographies across the array are consistent with focal neural sources, and determine the foci of equivalent current dipoles and their associated time courses. We then perform a clustering of the localized dipoles based on distance metrics that takes into consideration both their locations and time courses. The final step of refinement consists of retaining only those clusters that are statistically significant. The average locations and time series from significant clusters comprise the final output of our method. Results and significance: Data were processed from 4 patients with partial focal epilepsy. In all three subjects for whom surgical resection was performed, clusters were found in the vicinity of the resectioned area. Conclusions: The presented procedure is promising and likely to be useful to the physician as a more sensitive, automated and objective method to help in the localization of the interictal spike zone of intractable partial seizures. The final output can be visually verified by neurologists in terms of both the location and distribution of the dipole clusters and their associated time series. Due to the clinical relevance and demonstrated promise of this method, further investigation of this approach is warranted.
Article
We consider the problem of learning the structure of a pairwise graphical model over continuous and discrete variables. We present a new pairwise model for graphical models with both continuous and discrete variables that is amenable to structure learning. In previous work, authors have considered structure learning of Gaussian graphical models and structure learning of discrete models. Our approach is a natural generalization of these two lines of work to the mixed case. The penalization scheme involves a novel symmetric use of the group-lasso norm and follows naturally from a particular parameterization of the model. Supplementary materials for this article are available online.
Article
Milestones in sparse signal reconstruction and compressive sensing can be understood in a probabilistic Bayesian context, fusing underdetermined measurements with knowledge about low level signal properties in the posterior distribution, which is maximized for point estimation. We review recent progress to advance beyond this setting. If the posterior is used as distribution to be integrated over instead of merely an optimization criterion, sparse estimators with better properties may be obtained, and applications beyond point reconstruction from fixed data can be served. We describe novel variational relaxations of Bayesian integration, characterized as well as posterior maximization, which can be solved robustly for very large models by algorithms unifying convex reconstruction and Bayesian graphical model technology. They excel in difficult real-world imaging problems where posterior maximization performance is often unsatisfactory.
Article
Bayesian nonparametrics works - theoretically, computationally. The theory provides highly flexible models whose complexity grows appropriately with the amount of data. Computational issues, though challenging, are no longer intractable. All that is needed is an entry point: this intelligent book is the perfect guide to what can seem a forbidding landscape. Tutorial chapters by Ghosal, Lijoi and Prünster, Teh and Jordan, and Dunson advance from theory, to basic models and hierarchical modeling, to applications and implementation, particularly in computer science and biostatistics. These are complemented by companion chapters by the editors and Griffin and Quintana, providing additional models, examining computational issues, identifying future growth areas, and giving links to related topics. This coherent text gives ready access both to underlying principles and to state-of-the-art practice. Specific examples are drawn from information retrieval, NLP, machine vision, computational biology, biostatistics, and bioinformatics.
Article
This monograph provides an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks. The application areas are chosen with the following three criteria in mind: (1) expertise or knowledge of the authors; (2) the application areas that have already been transformed by the successful use of deep learning technology, such as speech recognition and computer vision; and (3) the application areas that have the potential to be impacted significantly by deep learning and that have been experiencing research growth, including natural language and text processing, information retrieval, and multimodal information processing empowered by multi-task deep learning.
Article
In this paper, we present an infinite hierarchical non-parametric Bayesian model to extract the hidden factors over observed data, where the number of hidden factors for each layer is unknown and can be potentially infinite. Moreover, the number of layers can also be infinite. We construct the model structure that allows continuous values for the hidden factors and weights, which makes the model suitable for various applications. We use the Metropolis-Hastings method to infer the model structure. Then the performance of the algorithm is evaluated by the experiments. Simulation results show that the model fits the underlying structure of simulated data.
Article
In this work, we propose a hierarchical latent dictionary approach to estimate the time-varying mean and covariance of a process for which we have only limited noisy samples. We fully leverage the limited sample size and redundancy in sensor measurements by transferring knowledge through a hierarchy of lower dimensional latent processes. As a case study, we utilize Magnetoencephalography (MEG) recordings of brain activity to identify the word being viewed by a human subject. Specifically, we identify the word category for a single noisy MEG recording, when only given limited noisy samples on which to train.
Article
Multi-subject electroencephalography (EEG) classification involves algorithm development for automatically categorizing brain waves measured from multiple subjects who undergo the same mental task. Common spatial patterns (CSP) or its probabilistic counterpart, PCSP, is a popular discriminative feature extraction method for EEG classification. Models in CSP or PCSP are trained on a subject-by-subject basis so that inter-subject information is neglected. In the case of multi-subject EEG classification, however, it is desirable to capture inter-subject relatedness in learning a model. In this paper we present a nonparametric Bayesian model for a multi-subject extension of PCSP where subject relatedness is captured by assuming that spatial patterns across subjects share a latent subspace. Spatial patterns and the shared latent subspace are jointly learned by variational inference. We use an infinite latent feature model to automatically infer the dimension of the shared latent subspace, placing Indian Buffet process (IBP) priors on our model. Numerical experiments on the BCI competition III IVa and IV 2a datasets demonstrate the high performance of our method, compared to PCSP and existing Bayesian multi-task CSP models.
Conference Paper
Multi-subject electroencephalography (EEG) classification involves the categorization of brain waves measured from multiple subjects, each of whom undergoes the same mental task. Common spatial patterns (CSP) or probabilistic CSP (PCSP) are widely used for extracting discriminative features from EEG, although they are trained on a subject-by-subject basis and inter-subject information is neglected. Moreover, the performance is degraded when only a few training samples are available for each subject. In this paper, we present a method for Bayesian CSP with Dirichlet process (DP) priors, where spatial patterns (corresponding to basis vectors) are simultaneously learned and clustered across subjects using variational Bayesian inference, which facilitates a flexible mixture model where the number of components are also learned. Spatial patterns in the same cluster share the hyperparameters of their prior distributions, which means information transfer is facilitated among subjects with similar spatial patterns. Numerical experiments using the BCI competition IV 2a dataset demonstrated the high performance of our method, compared with existing PCSP and Bayesian CSP methods with a single prior distribution.
Article
In many cases, observed brain signals can be assumed as the linear mixtures of unknown brain sources/components. It is the task of blind source separation (BSS) to find the sources. However, the number of brain sources is generally larger than the number of mixtures, which leads to an underdetermined model with infinite solutions. Under the reasonable assumption that brain sources are sparse within a domain, e.g., in the spatial, time, or time-frequency domain, we may obtain the sources through sparse representation. As explained in this article, several other typical problems, e.g., feature selection in brain signal processing, can also be formulated as the underdetermined linear model and solved by sparse representation. This article first reviews the probabilistic results of the equivalence between two important sparse solutions: the ℓ0-norm and ℓ1-norm solutions. In sparse representation-based brain component analysis including blind separation of brain sources and electroencephalogram (EEG) inverse imaging, the equivalence is related to the recoverability of the sources. This article also focuses on the applications of sparse representation in brain signal processing, including components extraction, BSS and EEG inverse imaging, feature selection, and classification. Based on functional magnetic resonance imaging (fMRI) and EEG data, the corresponding methods and experimental results are reviewed.
Article
We consider a class of adaptive MCMC algorithms using a Langevin-type proposal density. We state and prove regularity conditions for the convergence of these algorithms. In addition to these theoretical results we introduce a number of methodological innovations that can be applied much more generally. We assess the performance of these algorithms with simulation studies, including an example of the statistical analysis of a point process driven by a latent log-Gaussian Cox process.
Article
Magnetoencephalography (MEG) is an important non-invasive method for studying activity within the human brain. Source localization methods can be used to estimate spatiotemporal activity from MEG measurements with high temporal resolution, but the spatial resolution of these estimates is poor due to the ill-posed nature of the MEG inverse problem. Recent developments in source localization methodology have emphasized temporal as well as spatial constraints to improve source localization accuracy, but these methods can be computationally intense. Solutions emphasizing spatial sparsity hold tremendous promise, since the underlying neurophysiological processes generating MEG signals are often sparse in nature, whether in the form of focal sources, or distributed sources representing large-scale functional networks. Recent developments in the theory of compressed sensing (CS) provide a rigorous framework to estimate signals with sparse structure. In particular, a class of CS algorithms referred to as greedy pursuit algorithms can provide both high recovery accuracy and low computational complexity. Greedy pursuit algorithms are difficult to apply directly to the MEG inverse problem because of the high-dimensional structure of the MEG source space and the high spatial correlation in MEG measurements. In this paper, we develop a novel greedy pursuit algorithm for sparse MEG source localization that overcomes these fundamental problems. This algorithm, which we refer to as the Subspace Pursuit-based Iterative Greedy Hierarchical (SPIGH) inverse solution, exhibits very low computational complexity while achieving very high localization accuracy. We evaluate the performance of the proposed algorithm using comprehensive simulations, as well as the analysis of human MEG data during spontaneous brain activity and somatosensory stimuli. These studies reveal substantial performance gains provided by the SPIGH algorithm in terms of computational complexity, localization accuracy, and robustness.
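For orientation, a generic greedy pursuit (orthogonal matching pursuit) is sketched below; it does not reproduce SPIGH's subspace-pursuit hierarchy or the MEG-specific source-space structure, and the inputs A, y, and sparsity level are illustrative.

```python
import numpy as np

def omp(A, y, n_nonzero):
    """Orthogonal matching pursuit: repeatedly pick the column most
    correlated with the residual, then re-fit by least squares on the
    selected support."""
    residual = y.copy()
    support = []
    x = np.zeros(A.shape[1])
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        corr = np.abs(A.T @ residual)
        corr[support] = 0.0                          # do not reselect columns
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x[support] = coef
    return x
```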
Article
We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the Gibbs distribution, Markov random field (MRF) equivalence, this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states ("annealing"), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel "relaxation" algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios.
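A toy sketch of the annealing idea for a binary image under an Ising MRF prior and Gaussian noise, assuming a simple logarithmic cooling schedule; it is not the paper's general restoration algorithm, and the coupling and noise parameters are illustrative.

```python
import numpy as np

def anneal_denoise(noisy, beta=1.5, sigma=0.8, n_sweeps=30, rng=None):
    """Simulated annealing with Gibbs updates for a +/-1 image: an Ising MRF
    prior with coupling beta and a Gaussian likelihood with std sigma;
    the temperature is lowered each sweep so the sampler settles into
    low-energy (high-posterior) configurations, i.e., the MAP estimate."""
    rng = rng or np.random.default_rng(0)
    x = np.where(noisy >= 0, 1.0, -1.0)
    h, w = x.shape
    for sweep in range(n_sweeps):
        T = 1.0 / np.log(2.0 + sweep)                 # slow cooling schedule
        for i in range(h):
            for j in range(w):
                nb = sum(x[a, b] for a, b in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                         if 0 <= a < h and 0 <= b < w)
                # dE = E(x_ij = -1) - E(x_ij = +1): how much +1 is favored
                dE = 2 * beta * nb + 2 * noisy[i, j] / sigma ** 2
                p_plus = 1.0 / (1.0 + np.exp(-np.clip(dE / T, -50, 50)))
                x[i, j] = 1.0 if rng.uniform() < p_plus else -1.0
    return x
```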
Article
Classifying electroencephalography (EEG) signals is an important step in building EEG-based brain-computer interfaces (BCIs). Currently, kernel-based methods such as the support vector machine (SVM) are considered the state of the art for this problem. In this paper, we apply Gaussian process (GP) classification to binary discrimination of motor imagery EEG data. Compared with the SVM, GP-based methods naturally provide probabilistic outputs, which can be used to identify trusted predictions for post-processing in a BCI. Experimental results show that the GP-based classifiers perform similarly to kernel logistic regression and probabilistic SVM in terms of predictive likelihood, but outperform the SVM and K-nearest neighbor (KNN) classifiers in terms of 0–1 loss class prediction error.
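A minimal sketch of GP classification with probabilistic outputs using scikit-learn's GaussianProcessClassifier; the feature/label files, train/test split, and the 0.8 confidence threshold are hypothetical.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# Hypothetical (n_trials, n_features) matrix of, e.g., log-variance CSP
# features from motor-imagery EEG, with binary class labels.
X, y = np.load("features.npy"), np.load("labels.npy")   # placeholder file names

gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0))
gpc.fit(X[:100], y[:100])

# Unlike a raw SVM decision value, predict_proba yields class probabilities
# that can be thresholded to reject untrusted trials in a BCI loop.
proba = gpc.predict_proba(X[100:])
trusted = np.max(proba, axis=1) > 0.8
```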
Article
The data of interest are assumed to be represented as N-dimensional real vectors, and these vectors are compressible in some linear basis B, implying that the signal can be reconstructed accurately using only a small number M ≪ N of basis-function coefficients associated with B. Compressive sensing is a framework whereby one does not measure one of the aforementioned N-dimensional signals directly, but rather a set of related measurements, with the new measurements a linear combination of the original underlying N-dimensional signal. The number of required compressive-sensing measurements is typically much smaller than N, offering the potential to simplify the sensing system. Let f denote the unknown underlying N-dimensional signal, and g a vector of compressive-sensing measurements; then one may approximate f accurately by utilizing knowledge of the (under-determined) linear relationship between f and g, in addition to knowledge of the fact that f is compressible in B. In this paper we employ a Bayesian formalism for estimating the underlying signal f based on compressive-sensing measurements g. The proposed framework has the following properties: i) in addition to estimating the underlying signal f, "error bars" are also estimated, these giving a measure of confidence in the inverted signal; ii) using knowledge of the error bars, a principled means is provided for determining when a sufficient number of compressive-sensing measurements have been performed; iii) this setting lends itself naturally to a framework whereby the compressive-sensing measurements are optimized adaptively and hence not determined randomly; and iv) the framework accounts for additive noise in the compressive-sensing measurements and provides an estimate of the noise variance. In this paper we present the underlying theory, an associated algorithm, example results, and comparisons to other compressive-sensing inversion algorithms in the literature.
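As a rough stand-in for the Bayesian inversion described here (without the adaptive measurement design), the sketch below recovers a sparse signal from compressive measurements using scikit-learn's ARD regression, which places a per-coefficient variance hyperparameter and also estimates the noise level; the toy dimensions and noise scale are illustrative.

```python
import numpy as np
from sklearn.linear_model import ARDRegression

rng = np.random.default_rng(0)
N, M, K = 256, 80, 8                                # signal length, measurements, nonzeros
f = np.zeros(N)
f[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)      # random measurement matrix
g = Phi @ f + 0.01 * rng.standard_normal(M)         # noisy compressive measurements

# ARD assigns an individual variance hyperparameter to every coefficient,
# so most are driven toward zero, giving a sparse posterior-mean estimate
# plus an estimate of the measurement-noise precision (alpha_).
bcs = ARDRegression(fit_intercept=False)
bcs.fit(Phi, g)
f_hat = bcs.coef_
noise_var_hat = 1.0 / bcs.alpha_
```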
Chapter
Electroencephalograms (EEGs) are becoming increasingly important measurements of brain activity and they have great potential for the diagnosis and treatment of mental and brain diseases and abnormalities. With appropriate interpretation methods they are emerging as a key methodology to satisfy the increasing global demand for more affordable and effective clinical and healthcare services. Developing and understanding advanced signal processing techniques for the analysis of EEG signals is crucial in the area of biomedical research. This book focuses on these techniques, providing expansive coverage of algorithms and tools from the field of digital signal processing. It discusses their applications to medical data, using graphs and topographic images to show simulation results that assess the efficacy of the methods. Additionally, expect to find: explanations of the significance of EEG signal analysis and processing (with examples) and a useful theoretical and mathematical background for the analysis and processing of EEG signals; an exploration of normal and abnormal EEGs, neurological symptoms and diagnostic information, and representations of the EEGs; reviews of theoretical approaches in EEG modelling, such as restoration, enhancement, segmentation, and the removal of different internal and external artefacts from the EEG and ERP (event-related potential) signals; coverage of major abnormalities such as seizure, and mental illnesses such as dementia, schizophrenia, and Alzheimer's disease, together with their mathematical interpretations from the EEG and ERP signals and sleep phenomenon; descriptions of nonlinear and adaptive digital signal processing techniques for abnormality detection, source localization and brain-computer interfacing using multi-channel EEG data with emphasis on non-invasive techniques, together with future topics for research in the area of EEG signal processing.
Article
In this paper, we present an extensive performance evaluation of a novel source localization algorithm, Champagne. It is derived in an empirical Bayesian framework that yields sparse solutions to the inverse problem. It is robust to correlated sources and learns the statistics of non-stimulus-evoked activity to suppress the effect of noise and interfering brain activity. We tested Champagne on both simulated and real M/EEG data. The source locations used for the simulated data were chosen to test the performance on challenging source configurations. In simulations, we found that Champagne outperforms the benchmark algorithms in terms of both the accuracy of the source localizations and the correct estimation of source time courses. We also demonstrate that Champagne is more robust to correlated brain activity present in real MEG data and is able to resolve many distinct and functionally relevant brain areas with real MEG and EEG data.
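A simplified EM-style sparse Bayesian learning loop in the same spirit (per-source variance hyperparameters that shrink toward zero) is sketched below; it is not the exact Champagne fixed-point update, and it ignores noise learning, source orientations, and the interference suppression described in the paper.

```python
import numpy as np

def sbl_source_imaging(L, Y, noise_var=1.0, n_iter=50):
    """EM-style sparse Bayesian learning for Y = L S + noise, with an
    independent prior variance gamma_i per source; most gammas shrink
    toward zero, leaving a sparse set of active sources."""
    n_sensors, n_sources = L.shape
    gamma = np.ones(n_sources)
    S = np.zeros((n_sources, Y.shape[1]))
    for _ in range(n_iter):
        Sigma_y = noise_var * np.eye(n_sensors) + (L * gamma) @ L.T
        W = (gamma[:, None] * L.T) @ np.linalg.inv(Sigma_y)   # posterior mean operator
        S = W @ Y                                             # posterior mean sources
        # Diagonal of the posterior covariance of the sources
        post_var = gamma - np.einsum('ij,ji->i', W, L * gamma)
        gamma = (S ** 2).mean(axis=1) + post_var              # EM hyperparameter update
    return S, gamma
```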
Article
MEG/EEG are non-invasive imaging techniques that record brain activity with high temporal resolution. However, estimation of brain source currents from surface recordings requires solving an ill-conditioned inverse problem. Converging lines of evidence in neuroscience, from neuronal network models to resting-state imaging and neurophysiology, suggest that cortical activation is a distributed spatiotemporal dynamic process, supported by both local and long-distance neuroanatomic connections. Because spatiotemporal dynamics of this kind are central to brain physiology, inverse solutions could be improved by incorporating models of these dynamics. In this article, we present a model for cortical activity based on nearest-neighbor autoregression that incorporates local spatiotemporal interactions between distributed sources in a manner consistent with neurophysiology and neuroanatomy. We develop a dynamic Maximum a Posteriori Expectation-Maximization (dMAP-EM) source localization algorithm for estimation of cortical sources and model parameters based on the Kalman Filter, the Fixed Interval Smoother, and the EM algorithms. We apply the dMAP-EM algorithm to simulated experiments as well as to human experimental data. Furthermore, we derive expressions to relate our dynamic estimation formulas to those of standard static models, and show how dynamic methods optimally assimilate past and future data. Our results establish the feasibility of spatiotemporal dynamic estimation in large-scale distributed source spaces with several thousand source locations and hundreds of sensors, with resulting inverse solutions that provide substantial performance improvements over static methods.
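A bare-bones Kalman filter plus Rauch-Tung-Striebel smoother for a linear dynamic source model is sketched below to show the state-space machinery that a dMAP-EM-style solver iterates; the EM updates of the model parameters and the large-scale implementation tricks of the paper are omitted, and all matrices are assumed small and dense.

```python
import numpy as np

def kalman_smoother(Y, L, A, Q, R, x0, P0):
    """Filter/smoother for  x_t = A x_{t-1} + w_t,  y_t = L x_t + v_t,
    with w_t ~ N(0, Q), v_t ~ N(0, R); Y has shape (n_sensors, n_times)."""
    n_times, n = Y.shape[1], x0.size
    xf = np.zeros((n_times, n)); Pf = np.zeros((n_times, n, n))
    xp = np.zeros((n_times, n)); Pp = np.zeros((n_times, n, n))
    x, P = x0, P0
    for t in range(n_times):                               # forward (filter) pass
        xp[t], Pp[t] = A @ x, A @ P @ A.T + Q
        S = L @ Pp[t] @ L.T + R
        K = Pp[t] @ L.T @ np.linalg.inv(S)                 # Kalman gain
        x = xp[t] + K @ (Y[:, t] - L @ xp[t])
        P = Pp[t] - K @ L @ Pp[t]
        xf[t], Pf[t] = x, P
    xs = xf.copy()
    for t in range(n_times - 2, -1, -1):                   # backward (smoother) pass
        J = Pf[t] @ A.T @ np.linalg.inv(Pp[t + 1])
        xs[t] = xf[t] + J @ (xs[t + 1] - xp[t + 1])
    return xs          # smoothed source estimates, one row per time point
```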