## No full-text available

To read the full-text of this research,

you can request a copy directly from the authors.

EEG and MEG are the most common noninvasive brain imaging techniques for monitoring electrical brain activity and inferring brain function. The central goal of EEG/MEG analysis is to extract informative spatio-temporal-spectral brain patterns or to infer functional connectivity between different brain areas, both of which are directly useful for neuroscience or clinical investigations. Because the signals are potentially complex (nonstationary, high-dimensional, variable across subjects, and low in signal-to-noise ratio), EEG/MEG signal processing poses great challenges for researchers. These challenges can be addressed in a principled manner via Bayesian machine learning (BML). BML is an emerging field that integrates Bayesian statistics, variational methods, and machine learning techniques to solve problems ranging from regression and prediction to outlier detection, feature extraction, and classification. BML has recently gained increasing attention and widespread success in signal processing and big data analytics, for example in source reconstruction, compressed sensing, and information fusion. To review recent advances and to foster new research ideas, we provide a tutorial on several important emerging BML research topics in EEG/MEG signal processing and present representative examples in EEG/MEG applications.

... Reconstructing brain activities from electroencephalography (EEG) plays an important role in neuroscience research and clinical treatment [2,15,34]. For example, for drug-resistant epilepsy, the epileptogenic zone can be removed through a surgical intervention. ...

... The position of each dipole is fixed, so cortical activities can be estimated by solving a linear inverse problem. Since the dipoles largely outnumber the scalp sensors, the forward equation of DCD is underdetermined [2,16,34]. To obtain a unique solution, suitable constraints are needed to narrow the solution space. ...

... Since the EEG inverse problem is highly ill-posed, suitable regularization constraints are necessary to obtain a unique source solution [13,34]. The traditional \(L_2\)-norm-based methods (e.g., wMNE and LORETA) always ...
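As a rough illustration of the \(L_2\)-norm-based family mentioned in this excerpt, the sketch below computes a Tikhonov-regularized minimum-norm estimate for an underdetermined linear model. It is not the wMNE or LORETA implementation from the cited work; the function name `minimum_norm_estimate`, the random lead field, and the regularization value are assumptions made for the example.

```python
import numpy as np

def minimum_norm_estimate(leadfield, eeg, lam=1e-2):
    """Return x = L^T (L L^T + lam * I)^{-1} y for the model y = L x."""
    n_sensors = leadfield.shape[0]
    gram = leadfield @ leadfield.T + lam * np.eye(n_sensors)
    return leadfield.T @ np.linalg.solve(gram, eeg)

# Hypothetical toy problem: far more candidate sources than sensors.
rng = np.random.default_rng(0)
n_sensors, n_sources = 8, 50
L = rng.standard_normal((n_sensors, n_sources))   # stand-in for a lead field matrix
x_true = np.zeros(n_sources)
x_true[[5, 20]] = [1.0, -0.5]                     # two active sources
y = L @ x_true                                    # noiseless sensor data

x_hat = minimum_norm_estimate(L, y, lam=1e-6)
residual = np.linalg.norm(L @ x_hat - y)          # data misfit of the estimate
```

The estimate reproduces the data with the smallest \(L_2\) norm among candidate solutions, but it generally spreads energy over many sources rather than recovering the two active ones, which is one motivation for the sparsity-promoting constraints discussed in these excerpts.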

It is a long-standing challenge to reconstruct the locations and extents of cortical neural activities from electroencephalogram (EEG) recordings, especially when the EEG signals contain strong background activities and outlier artifacts. In this work, we propose a robust source imaging method called \(L_1\)R-SSSI. To alleviate the effect of outliers in EEG, \(L_1\)R-SSSI employs the \(L_1\)-loss to model the residual error. To obtain locally smooth and globally sparse estimations, \(L_1\)R-SSSI adopts the structured sparsity constraint, which incorporates the \(L_1\)-norm regularization in both the variation and original source domain. The estimations of \(L_1\)R-SSSI are efficiently obtained using the alternating direction method of multipliers (ADMM) algorithm. Results of simulated and experimental data analysis demonstrate that \(L_1\)R-SSSI effectively suppresses the effect of the outlier artifacts in EEG. \(L_1\)R-SSSI outperforms the traditional \(L_2\)-norm-based methods (e.g., wMNE, LORETA), and SISSY, which employs \(L_2\)-norm loss and structured sparsity, indicated by the larger AUC (average AUC \(>0.80\)), smaller SD (average SD \(<50\) mm), DLE (average DLE \(<10\) mm) and RMSE (average RMSE \(<1.75\)) values under all the numerically simulated conditions. \(L_1\)R-SSSI also provides better estimations of extended sources than the method with \(L_1\)-loss and \(L_p\)-norm regularization term (e.g., LAPPS).
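One way to write down the kind of objective this abstract describes is given below. This is a sketch rather than the authors' exact formulation: the symbols \(B\) (EEG recordings), \(L\) (lead field), \(S\) (source matrix), \(V\) (a variation operator on the source space), and the weights \(\lambda, \alpha\) are assumed here, and the norms are taken entrywise.

\[
\hat{S} = \arg\min_{S} \; \lVert B - LS \rVert_1 + \lambda \left( \lVert VS \rVert_1 + \alpha \lVert S \rVert_1 \right)
\]

The first term is the \(L_1\)-loss on the residual, which is less sensitive to outliers than a squared loss. The term \(\lVert VS \rVert_1\) penalizes variation between neighboring sources (local smoothness), and \(\lVert S \rVert_1\) penalizes the sources themselves (global sparsity).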

... Furthermore, these properties are included in the overall analysis through the assumed EEG sources' model and various assumptions about the model. Clearly, the linear observation model [17,18], the linear dynamical model (or Kalman Filters) [17,18], and the multiple measurement vector (MMV) model [19] make different generative modelling assumptions about the underlying mechanisms that produce the EEG data. The spatial properties of EEG sources are encoded into the linear observation model through the use of prior distributions or regularization terms. ...

... where \(N\) is the symbol for Gaussian distribution. In Sparse Bayesian Learning literature [18,25,26], a common approach is to assume that the covariance matrix \(\Lambda\) is a diagonal matrix with elements \(a_i^{-1}\), \(i = 1, \ldots, 3M\). ...
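Written out, the diagonal-covariance assumption in this excerpt corresponds to a prior of the following form. The symbol \(x\) for the source vector is an assumption here, since the excerpt does not show the preceding equation.

\[
p(x \mid a) = N(x \mid 0, \Lambda), \qquad \Lambda = \operatorname{diag}\left(a_1^{-1}, \ldots, a_{3M}^{-1}\right)
\]

Each component of \(x\) then has its own variance \(a_i^{-1}\), so a very large \(a_i\) drives the corresponding component toward zero.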

We propose a new method for EEG source localization. An efficient solution to this problem requires choosing an appropriate regularization term in order to constrain the original problem. In our work, we adopt the Bayesian framework to place constraints; hence, the regularization term is closely connected to the prior distribution. More specifically, we propose a new sparse prior for the localization of EEG sources. The proposed prior distribution has sparse properties favoring focal EEG sources. In order to obtain an efficient algorithm, we use the variational Bayesian (VB) framework, which provides us with a tractable iterative algorithm of closed-form equations. Additionally, we provide extensions of our method to cases where we observe group structures and spatially extended EEG sources. We have performed experiments using synthetic EEG data and real EEG data from three publicly available datasets. The real EEG data were recorded during the presentation of auditory and visual stimuli. We compare the proposed method with well-known approaches to EEG source localization, and the results show that our method achieves state-of-the-art performance, especially in cases where we expect few activated brain regions. The proposed method can effectively detect EEG sources in various circumstances. Overall, the proposed sparse prior for EEG source localization results in more accurate localization of EEG sources than state-of-the-art approaches.

... Concrete instantiations of this approach have further been introduced under the names sparse Bayesian learning (SBL) or automatic relevance determination (ARD) [8], variational Bayes (VB) [9], [36] and iteratively-reweighted MAP estimation [10], [37]. Interested readers are referred to [38] for a comprehensive survey on Bayesian machine learning techniques for EEG/MEG signals. To distinguish all these Type-II variants from classical ML and MAP approaches not involving hyperparameter learning, the latter are also referred to as Type-I approaches. ...

... This can be easily guaranteed if Eq. (38) is convex. While the second term in (38) is convex, the first term, log |Γ|, is in fact concave, which hampers conclusions concerning the convexity of their sum. However, we can use the concept of geodesic convexity or g-convexity from non-Euclidean and geometric optimization, which enables us to prove that any local minimum of Eq. (38) is actually a global minimum. ...

... (38) is g-convex; hence, any local minimum of \(L^{\mathrm{EM}}_{k}(\gamma \mid \gamma^{k})\) is a global minimum according to Proposition 10. This proves that condition [A3] is fulfilled and completes the proof of Proposition 3. ...

Methods for electro- or magnetoencephalography (EEG/MEG) based brain source imaging (BSI) using sparse Bayesian learning (SBL) have been demonstrated to achieve excellent performance in situations with low numbers of distinct active sources, such as event-related designs. This paper extends the theory and practice of SBL in three important ways. First, we reformulate three existing SBL algorithms under the majorization-minimization (MM) framework. This unification perspective not only provides a useful theoretical framework for comparing different algorithms in terms of their convergence behavior, but also provides a principled recipe for constructing novel algorithms with specific properties by designing appropriate bounds of the Bayesian marginal likelihood function. Second, building on the MM principle, we propose a novel method called LowSNR-BSI that achieves favorable source reconstruction performance in low signal-to-noise-ratio settings. Third, precise knowledge of the noise level is a crucial requirement for accurate source reconstruction. Here we present a novel principled technique to accurately learn the noise variance from the data either jointly within the source reconstruction procedure or using one of two proposed cross-validation strategies. Empirically, we could show that the monotonous convergence behavior predicted from MM theory is confirmed in numerical experiments. Using simulations, we further demonstrate the advantage of LowSNR-BSI over conventional SBL in low-SNR regimes, and the advantage of learned noise levels over estimates derived from baseline data. To demonstrate the usefulness of our novel approach we show neurophysiologically plausible source reconstructions on averaged auditory evoked potential data.

... This problem is addressed by imposing prior distributions on the model parameters and adopting a Bayesian treatment. This can be performed either through Maximum-a-Posteriori (MAP) estimation (Type-I Bayesian learning) [23]-[27] or, when the model has unknown hyperparameters, through Type-II Maximum-Likelihood estimation (Type-II Bayesian learning) [28]-[32]. In this paper, we focus on Type-II Bayesian learning, which assumes a family of prior distributions \(p(X|\Theta)\) parameterized by a set of hyperparameters \(\Theta\). ...
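The distinction drawn in this excerpt can be summarized in two lines. The symbol \(Y\) for the observed data is assumed here; \(X\) and \(\Theta\) follow the excerpt.

\[
\text{Type-I (MAP):} \quad \hat{X} = \arg\max_{X} \; p(Y \mid X)\, p(X)
\]

\[
\text{Type-II:} \quad \hat{\Theta} = \arg\max_{\Theta} \; p(Y \mid \Theta) = \arg\max_{\Theta} \int p(Y \mid X)\, p(X \mid \Theta)\, dX
\]

In the Type-II case the parameters \(X\) are integrated out, the hyperparameters are fitted to the data, and \(X\) is then inferred from the posterior \(p(X \mid Y, \hat{\Theta})\).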

... Summing up (32) and (34) proves inequality (30) and concludes the first part of the proof. ...

We consider the reconstruction of brain activity from electroencephalography (EEG). This inverse problem can be formulated as a linear regression with independent Gaussian scale mixture priors for both the source and noise components. Crucial factors influencing accuracy of source estimation are not only the noise level but also its correlation structure, but existing approaches have not addressed estimation of noise covariance matrices with full structure. To address this shortcoming, we develop hierarchical Bayesian (type-II maximum likelihood) models for observations with latent variables for source and noise, which are estimated jointly from data. As an extension to classical sparse Bayesian learning (SBL), where across-sensor observations are assumed to be independent and identically distributed, we consider Gaussian noise with full covariance structure. Using the majorization-maximization framework and Riemannian geometry, we derive an efficient algorithm for updating the noise covariance along the manifold of positive definite matrices. We demonstrate that our algorithm has guaranteed and fast convergence and validate it in simulations and with real MEG data. Our results demonstrate that the novel framework significantly improves upon state-of-the-art techniques in the real-world scenario where the noise is indeed non-diagonal and fully-structured. Our method has applications in many domains beyond biomagnetic inverse problems.

... Previous work has applied Machine Learning (ML) to MEG data for this purpose [1,6,7]. ML models have also been applied to detect AD using other neuroimaging signals such as functional magnetic resonance imaging (fMRI) [8,9], magnetic resonance imaging (MRI) [2] and electroencephalography (EEG) [10,11,12]. Similarly, neuroimaging data have been analyzed with machine learning techniques to investigate a large variety of brain conditions and processes including rapid eye movement disorders [11], speech tasks classification [13], Parkinson's disease [14], epilepsy [15], spinal cord injury [15], scene representations [16] and ocular and cardiac artifacts detection [17]. ...

... Similarly, neuroimaging data have been analyzed with machine learning techniques to investigate a large variety of brain conditions and processes including rapid eye movement disorders [11], speech tasks classification [13], Parkinson's disease [14], epilepsy [15], spinal cord injury [15], scene representations [16] and ocular and cardiac artifacts detection [17]. The ML models that have been employed in this context are: (1) classical models (Random Forest, Bayesian Network, Decision Trees, K-Nearest Neighbors, Logistic Regression, Support Vector Machines, and Linear Discriminant Analysis) [6,7]; (2) deep learning models (convolutional neural networks (CNN) [2,9,11,14,15], recurrent neural networks (RNN) [10], and custom architectures [8,13,17,18]); and (3) Bayesian ML models [12]. ...

The early detection of Alzheimer’s disease can potentially make eventual treatments more effective. This work presents a deep learning model to detect early symptoms of Alzheimer’s disease using synchronization measures obtained with magnetoencephalography. The proposed model is a novel deep learning architecture based on an ensemble of randomized blocks formed by a sequence of 2D-convolutional, batch-normalization and pooling layers. An important challenge is to avoid overfitting, as the number of features is very high (25755) compared to the number of samples (132 patients). To address this issue the model uses an ensemble of identical sub-models all sharing weights, with a final stage that performs an average across sub-models. To facilitate the exploration of the feature space, each sub-model receives a random permutation of features. The features correspond to magnetic signals reflecting neural activity and are arranged in a matrix structure interpreted as a 2D image that is processed by 2D convolutional networks.
The proposed detection model is a binary classifier (disease/non-disease), which compared to other deep learning architectures and classic machine learning classifiers, such as random forest and support vector machine, obtains the best classification performance results with an average F1-score of 0.92. To perform the comparison a strict validation procedure is proposed, and a thorough study of results is provided.

... For the resolution of the M/EEG inverse problem many sophisticated algorithms have been developed during the last decades considering different techniques: regularization [4], [6], [9], machine learning [7], [17] and probabilistic approaches [12], [13]. One requested algorithm's feature is the capability to take into account those prior information on the source localization that come from clinical analysis in order to a priori exclude/rank some brain regions, thus defining the so-called Region of Interest (ROI) [16]. ...

... Along this line, the L p norm iterative sparse solution (LPISS) [11] is an iterative sparse learning algorithm based on a L p norm. Sparse Bayesian learning approaches (SBL) [12]- [14] cast the inverse problem under a empirical Bayesian framework where hyperparameters can be automatically determined and sparse solutions can be obtained. In particular, Champagne [15], [16] is an SBL approach that estimates the number, location, and time course of the sources in a principled fashion. ...

Electromagnetic source imaging (ESI) requires solving a highly ill-posed inverse problem. To seek a unique solution, traditional ESI methods impose various forms of priors that may not accurately reflect the actual source properties, which may hinder their broad applications. To overcome this limitation, in this paper a novel data-synthesized spatio-temporally convolutional encoder-decoder network method termed DST-CedNet is proposed for ESI. DST-CedNet recasts ESI as a machine learning problem, where discriminative learning and latent-space representations are integrated in a convolutional encoder-decoder network (CedNet) to learn a robust mapping from the measured electroencephalography/magnetoencephalography (E/MEG) signals to the brain activity. In particular, by incorporating prior knowledge regarding dynamical brain activities, a novel data synthesis strategy is devised to generate large-scale samples for effectively training CedNet. This stands in contrast to traditional ESI methods where the prior information is often enforced via constraints primarily aimed for mathematical convenience. Extensive numerical experiments as well as analysis of a real MEG and Epilepsy EEG dataset demonstrate that DST-CedNet outperforms several state-of-the-art ESI methods in robustly estimating source signals under a variety of source configurations.

... SBL provides a powerful framework for learning parsimonious linear latent variable models from EEG, with applications encompassing electromagnetic source imaging [41], [42], probabilistic generative modeling of EEG (e.g., oscillations and ERPs) [43], [44], and EEG decoding [11], [45]. In essence, SBL is an empirical Bayes paradigm that imposes parameterized prior on the latent variables and enforces sparsity via maximizing the marginal likelihood (also known as Type-II maximum likelihood, or evidence maximization). ...
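To make the evidence-maximization idea in this excerpt more concrete, here is a minimal sketch of sparse Bayesian learning with expectation-maximization (EM) updates on a toy linear model. It assumes a known noise variance and independent zero-mean Gaussian priors with one variance per coefficient; the function name `sbl_em` and the toy data are hypothetical, and the cited works may use different update rules.

```python
import numpy as np

def sbl_em(Phi, y, noise_var=1e-2, n_iter=200):
    """EM updates for y = Phi @ x + noise with prior x_i ~ N(0, gamma_i)."""
    n_obs, n_features = Phi.shape
    gamma = np.ones(n_features)          # prior variances (the hyperparameters)
    for _ in range(n_iter):
        # Marginal covariance of y implied by the current hyperparameters
        Sigma_y = noise_var * np.eye(n_obs) + (Phi * gamma) @ Phi.T
        Sy_inv_Phi = np.linalg.solve(Sigma_y, Phi)
        # E-step: posterior mean and marginal posterior variances of x
        mu = gamma * (Sy_inv_Phi.T @ y)
        post_var = gamma - gamma**2 * np.sum(Phi * Sy_inv_Phi, axis=0)
        # M-step: set each prior variance to the posterior second moment
        gamma = mu**2 + post_var
    return mu, gamma                     # mu comes from the final E-step

# Hypothetical toy problem: 40 candidate coefficients, only two are active.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((20, 40))
x_true = np.zeros(40)
x_true[3], x_true[17] = 2.0, -1.5
y = Phi @ x_true + 0.05 * rng.standard_normal(20)

mu, gamma = sbl_em(Phi, y)
support = set(np.argsort(gamma)[-2:].tolist())   # indices with the largest learned variances
```

Coefficients whose learned variance `gamma` shrinks toward zero are effectively pruned from the model, which is how maximizing the marginal likelihood yields sparse solutions without a hand-tuned sparsity penalty.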

Decoding brain activity from non-invasive electroencephalography (EEG) is crucial for brain-computer interfaces (BCIs) and the study of brain disorders. Notably, end-to-end EEG decoding has gained widespread popularity in recent years owing to the remarkable advances in deep learning research. However, the sample sizes in many EEG studies are often too limited to prevent generic deep learning models from overfitting the highly noisy EEG data, leading to only suboptimal generalization performance. To address this fundamental limitation, this paper proposes a novel end-to-end EEG decoding algorithm in which the spatio-temporal filters and the classifier are all encoded in a low-rank weight matrix and optimized under a principled sparse Bayesian learning (SBL) framework. Importantly, this SBL framework also enables us to learn the hyperparameters that optimally penalize the model in a Bayesian fashion. The performance of the proposed decoding algorithm is systematically assessed on five motor imagery EEG datasets (N = 192) and an emotion recognition EEG dataset (N = 45), in comparison with several contemporary algorithms, including end-to-end deep learning-based EEG decoding algorithms. The classification results demonstrate that our algorithm significantly outperforms the competing algorithms while yielding neurophysiologically meaningful spatio-temporal patterns. Our algorithm therefore advances the state of the art by providing a novel EEG-tailored machine learning tool for decoding brain activity.

... Naïve Bayes (NB) computes class probabilities using Bayes' theorem. It is simple to implement and fast to train, and it achieves high accuracy when computing probabilities for noisy data. NB methods include Multinomial NB, Bernoulli NB, and Complement NB (Wu et al., 2015). ...

With the emergence of the COVID-19 pandemic, e-learning was the only way to solve the problem of study interruption in educational institutions and universities. Therefore, this field has received significant attention in recent times. In this paper, we used ten machine learning (ML) algorithms (Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), SGD Classifier, Multinomial NB, K-Nearest Neighbors Classifier (KNN), Ridge Classifier, Nearest Centroid, Complement NB, and Bernoulli NB) to build a prediction system based on artificial intelligence techniques that predicts the difficulties students face in using the e-learning management system and supports related decision-making. This, in turn, contributes to supporting the sustainable development of technology at the university. From the results obtained, we identify the important factors that affect the use of e-learning to solve students' learning difficulties with the LMS.

... Furthermore, the Bayesian statistical method in combination with a logistic regression model is an efficient approach to understanding a problem domain and predicting the outcomes of interventions [15,16]. In the present study, a logistic regression model with Bayesian supervised learning inference was employed to elucidate quantitative effects of 1- to 6-comorbidity risk factors for dementia, namely depression, vascular disease, severe head injury, hearing loss, DM, and senile cataract, which were identified from a nationwide longitudinal population-based database. ...

Dementia is one of the most burdensome illnesses in elderly populations worldwide. However, the literature about multiple risk factors for dementia is scant. The aim of this study was to develop a simple, rapid, and appropriate predictive tool for the clinical quantitative assessment of multiple risk factors for dementia. This was a population-based cohort study. Based on the Taiwan National Health Insurance Research Database, participants first diagnosed with dementia from 2000 to 2009 and aged ≥65 years in 2000 were included. A logistic regression model with Bayesian supervised learning inference was implemented to evaluate the quantitative effects of 1- to 6-comorbidity risk factors for dementia in the elderly Taiwanese population: depression, vascular disease, severe head injury, hearing loss, diabetes mellitus (DM), and senile cataract, identified from a nationwide longitudinal population-based database. This study enrolled 4749 (9.5%) patients first diagnosed as having dementia. Older age, female sex, urban residence, and low income were found to be independent sociodemographic risk factors for dementia. Among all odds ratios (ORs) of 2-comorbidity risk factors for dementia, comorbid depression and vascular disease had the highest adjusted OR of 6.726. The 5-comorbidity risk factors, namely depression, vascular disease, severe head injury, hearing loss, and DM, exhibited the highest OR of 8.767. Overall, the quantitative effects of 2 to 6 comorbidities and age difference on dementia gradually increased; hence, their ORs were less than additive. These results indicate that depression is a key comorbidity risk factor for dementia. The present findings suggest that physicians should pay more attention to the role of depression in dementia development. It is urgent to evaluate the nature of the link between depression and dementia and to test further to what extent controlling depression could effectively prevent dementia.
Abbreviations: ADVI = automatic differentiation variational inference, DM = diabetes mellitus, ICD-9-CM = International Classification of Disease, Ninth Revision, Clinical Modification, NHIRD = National Health Insurance Research Database, OR = odds ratio.

... Over the last decades, the Bayesian compressive sensing (BCS) framework, which originated from sparse Bayesian learning (SBL), has become an active sub-class of sparse signal reconstruction algorithms [1-7] and has been widely applied in many fields, such as array synthesis [8], direction-of-arrival (DOA) estimation [9], radar localisation and imaging [10-12], and electrocardiogram, electroencephalography (EEG), and magnetoencephalography (MEG) signal processing [13,14]. ...

Bayesian compressive sensing (BCS) is an important sub-class of sparse signal reconstruction algorithms. In this paper, a modified complex multitask Bayesian compressive sensing (MCMBCS) algorithm using the Laplacian scale mixture (LSM) prior is proposed. The LSM prior is first introduced into the complex BCS framework because it promotes sparsity better and is more flexible than the traditional Laplacian prior. Furthermore, by integrating out the noise variance analytically, the MCMBCS algorithm significantly improves the signal recovery performance compared with the original CMBCS. More importantly, the authors not only present the iterative algorithm but also develop a sub-optimal fast implementation method based on marginal likelihood maximisation, which dramatically reduces the computational complexity. Finally, numerical simulations validate that the proposed algorithm outperforms existing work in reconstruction accuracy and computational efficiency. The results suggest that the proposed algorithm has great potential in the complex-valued signal processing field.

... Naïve Bayes (NB) computes class probabilities using Bayes' theorem. It is simple to implement and fast to train, and it achieves high accuracy when computing probabilities for noisy data. NB methods include Multinomial NB, Bernoulli NB, and Complement NB (Wu et al., 2015). ...

With the emergence of the COVID-19 pandemic, e-learning was the only way to solve the problem of study interruption in educational institutions and universities. Therefore, this field has garnered significant attention in recent times. In this paper, we used ten machine-learning algorithms (Logistic Regression, Decision Tree, Random Forest, SGD Classifier, Multinomial NB, K-Neighbors Classifier, Ridge Classifier, Nearest Centroid, Complement NB, and Bernoulli NB) to build a prediction system based on artificial intelligence techniques that predicts the difficulties students face in using the e-learning management system and supports related decision-making. This, in turn, contributes to supporting the sustainable development of technology at the university. From the results obtained, we found the important factors that affect the use of e-learning to solve students' learning difficulties with the LMS.

... • Naïve Bayes: Naïve Bayes [95] classifiers are simple probabilistic classifiers. They are based on the following assumption: all the input features are independent of each other and no correlation exists between them. ...
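The independence assumption described in this excerpt means the class-conditional likelihood factorizes over features, so a classifier only needs per-feature statistics. The sketch below shows this for binary features (a Bernoulli variant with Laplace smoothing); the function names and the tiny dataset are made up for illustration and are not taken from the cited work.

```python
import numpy as np

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and per-feature probabilities P(x_j = 1 | class)."""
    classes = np.unique(y)
    log_prior = np.log(np.array([np.mean(y == c) for c in classes]))
    # Laplace-smoothed frequency of each feature being 1 within each class
    theta = np.array([
        (X[y == c].sum(axis=0) + alpha) / ((y == c).sum() + 2 * alpha)
        for c in classes
    ])
    return classes, log_prior, theta

def predict_bernoulli_nb(model, X):
    """Pick the class with the highest log posterior under feature independence."""
    classes, log_prior, theta = model
    # Independence: the log-likelihood is a sum of per-feature terms
    log_lik = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
    return classes[np.argmax(log_prior + log_lik, axis=1)]

# Hypothetical toy data: three binary features, two classes.
X_train = np.array([[1, 1, 0], [1, 0, 0], [1, 1, 0],
                    [0, 0, 1], [0, 1, 1], [0, 0, 1]])
y_train = np.array([0, 0, 0, 1, 1, 1])

model = fit_bernoulli_nb(X_train, y_train)
predictions = predict_bernoulli_nb(model, np.array([[1, 0, 0], [0, 0, 1]]))
```

Because no feature interactions are modeled, training reduces to counting, which is why such classifiers are cheap to fit.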

Video content now occupies about 82% of global internet traffic. This large percentage is due to the revolution in video content consumption. On the other hand, the market is increasingly demanding videos with higher resolutions and qualities. This causes a significant increase in the amount of data to be transmitted. Hence the need to develop video coding algorithms even more efficient than existing ones to limit the increase in the rate of data transmission and ensure a better quality of service. In addition, the impressive consumption of multimedia content in electronic products has an ecological impact. Therefore, finding a compromise between the complexity of algorithms and the efficiency of implementations is a new challenge. As a result, a collaborative team was created with the aim of developing a new video coding standard, Versatile Video Coding – VVC/H.266. Although VVC was able to achieve a more than 40% reduction in throughput compared to HEVC, this does not mean at all that there is no longer a need to further improve coding efficiency. In addition, VVC adds remarkable complexity compared to HEVC. This thesis responds to these problems by proposing three new encoding methods. The contributions of this research are divided into two main axes. The first axis is to propose and implement new compression tools in the new standard, capable of generating additional coding gains. Two methods have been proposed for this first axis. These two methods rely on the derivation of prediction information at the decoder side. This is because increasing encoder choices can improve the accuracy of predictions and yield less energy residue, leading to a reduction in bit rate. Nevertheless, more prediction modes involve more signaling to be sent into the binary stream to inform the decoder of the choices that have been made at the encoder. The gains mentioned above are therefore more than offset by the added signaling. 
If the prediction information has been derived from the decoder, the latter is no longer passive, but becomes active hence the concept of intelligent decoder. Thus, it will be useless to signal the information, hence a gain in signalization. Each of the two methods offers a different intelligent technique than the other to predict information at the decoder level. The first technique constructs a histogram of gradients to deduce different intra-prediction modes that can then be combined by means of prediction fusion, to obtain the final intra-prediction for a given block. This fusion property makes it possible to more accurately predict areas with complex textures, which, in conventional coding schemes, would rather require partitioning and/or finer transmission of high-energy residues. The second technique gives VVC the ability to switch between different interpolation filters of the inter prediction. The deduction of the optimal filter selected by the encoder is achieved through convolutional neural networks. The second axis, unlike the first, does not seek to add a contribution to the VVC algorithm. This axis rather aims to build an optimized use of the already existing algorithm. The ultimate goal is to find the best possible compromise between the compression efficiency delivered and the complexity imposed by VVC tools. Thus, an optimization system is designed to determine an effective technique for activating the new coding tools. The determination of these tools can be done either using artificial neural networks or without any artificial intelligence technique.

... Detection of spatiotemporal features of esophageal abnormality from endoscopic videos by incorporating a 3D convolutional neural network and convolutional long short-term memories (LSTM) was reported in [38] for the first time. Bayesian machine learning (BML) was discussed as a method to extract informative brain spatiotemporal-spectral patterns from electroencephalography (EEG) and magnetoencephalography (MEG) [39]. ...

A low-cost machine learning (ML) algorithm is proposed and discussed for spatial tracking of unknown, correlated signals in localized, ad-hoc wireless sensor networks. Each sensor is modeled as one neuron, and a selected subset of these neurons is called to identify the spatial signal. The algorithm is implemented in two phases: spatial modeling and spatial tracking. The spatial signal is modeled using its \(M\) iso-contour lines at levels \(\{\ell_j\}_{j=1}^{M}\), and those sensors whose observations fall within a \(\Delta\) margin of any of these levels report their observations to the fusion center (FC) for spatial signal reconstruction. In the spatial modeling phase, the number of these contour lines, their levels, and a proper \(\Delta\) are identified. In this phase, the algorithm may use either the adaptive-weight stochastic gradient or the scaled stochastic gradient method to select a proper \(\Delta\). The sensor observations are assumed to be corrupted by zero-mean additive white Gaussian noise (AWGN). To reduce the effect of the observation noise, each sensor applies a moving average filter to its observations. The modeling performance, the cost, and the convergence of the algorithm are discussed based on extensive computer simulations and reasoning. The algorithm is proposed for climate and environmental monitoring. In this paper, the cost is defined as the percentage of wireless sensors that initiate a communication attempt. The performance evaluation results show that the proposed spatial tracking approach is low-cost and can model the spatial signal over time with the same performance as that of spatial modeling.
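Two ingredients of this abstract lend themselves to a short sketch: the moving average filter each sensor applies to its noisy observations, and the rule that only sensors within a margin of some contour level report to the fusion center. The code below is an illustration under assumed names and values (`moving_average`, `reporting_sensors`, the window length, levels, and margin); it does not reproduce the paper's modeling or tracking phases.

```python
import numpy as np

def moving_average(samples, window):
    """Average each run of `window` consecutive samples to suppress zero-mean noise."""
    kernel = np.ones(window) / window
    return np.convolve(samples, kernel, mode="valid")

def reporting_sensors(values, levels, delta):
    """Mark sensors whose value lies within `delta` of any contour level."""
    distance = np.abs(np.asarray(values)[:, None] - np.asarray(levels)[None, :])
    return np.any(distance <= delta, axis=1)

# Noise suppression at a single sensor (hypothetical constant signal plus AWGN).
rng = np.random.default_rng(0)
noisy = 2.0 + rng.standard_normal(10_000)
smoothed = moving_average(noisy, window=10)

# Reporting decision across four sensors, with assumed levels and margin.
sensor_values = np.array([0.95, 1.5, 2.08, 3.4])
mask = reporting_sensors(sensor_values, levels=[1.0, 2.0, 3.0], delta=0.1)
cost = mask.mean()   # fraction of sensors that would attempt to communicate
```

Here `cost` mirrors the abstract's definition of cost as the share of sensors that initiate a communication attempt: a smaller margin lowers the cost but leaves fewer observations for reconstructing the contours.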

... Concrete instantiations of this approach have further been introduced under the names sparse Bayesian learning (SBL) (Tipping, 2001) or automatic relevance determination (ARD) (Tipping, 2000), kernel Fisher discriminant (KFD) (Mika et al., 2001), variational Bayes (VB) (Seeger and Wipf, 2010;Wipf and Nagarajan, 2009) and iteratively-reweighted MAP estimation (Gorodnitsky et al., 1995;. Interested readers are referred to (Wu et al., 2016) for a comprehensive survey on Bayesian machine learning techniques for EEG/MEG signals. To distinguish all these Type-II variants from classical ML and MAP approaches not involving hyperparameter learning, the latter are also referred to as Type-I approaches. ...

Methods for electro- or magnetoencephalography (EEG/MEG) based brain source imaging (BSI) using sparse Bayesian learning (SBL) have been demonstrated to achieve excellent performance in situations with low numbers of distinct active sources, such as event-related designs. This paper extends the theory and practice of SBL in three important ways. First, we reformulate three existing SBL algorithms under the majorization-minimization (MM) framework. This unifying perspective not only provides a useful theoretical framework for comparing different algorithms in terms of their convergence behavior, but also provides a principled recipe for constructing novel algorithms with specific properties by designing appropriate bounds of the Bayesian marginal likelihood function. Second, building on the MM principle, we propose a novel method called LowSNR-BSI that achieves favorable source reconstruction performance in low signal-to-noise-ratio (SNR) settings. Third, precise knowledge of the noise level is a crucial requirement for accurate source reconstruction. Here we present a novel principled technique to accurately learn the noise variance from the data, either jointly within the source reconstruction procedure or using one of two proposed cross-validation strategies. Empirically, we show that the monotonic convergence behavior predicted by MM theory is confirmed in numerical experiments. Using simulations, we further demonstrate the advantage of LowSNR-BSI over conventional SBL in low-SNR regimes, and the advantage of learned noise levels over estimates derived from baseline data. To demonstrate the usefulness of our novel approach, we show neurophysiologically plausible source reconstructions on averaged auditory evoked potential data.
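To make the SBL idea in the abstract above more concrete, here is a minimal sketch of the classic EM-style SBL update for a linear forward model y = Lx + noise at a single time point. It is not the LowSNR-BSI method and does not learn the noise level. The function name `sbl_em`, the fixed `noise_var`, and the single-time-point setting are assumptions made purely for illustration.

```python
import numpy as np

def sbl_em(L, y, noise_var, n_iter=100):
    """EM-style sparse Bayesian learning for y = L @ x + noise (illustrative sketch).

    L         : (n_sensors, n_sources) lead field matrix
    y         : (n_sensors,) measurement vector
    noise_var : assumed known noise variance (hypothetical simplification)
    Returns the posterior mean of x and the learned per-source prior variances.
    """
    n_sensors, n_sources = L.shape
    gamma = np.ones(n_sources)  # prior variance of each source
    mu = np.zeros(n_sources)
    for _ in range(n_iter):
        # Model data covariance: noise_var * I + L diag(gamma) L^T
        sigma_y = noise_var * np.eye(n_sensors) + (L * gamma) @ L.T
        K = np.linalg.solve(sigma_y, L)        # sigma_y^{-1} L
        mu = gamma * (K.T @ y)                 # posterior mean of the sources
        q = np.einsum("ij,ij->j", L, K)        # l_i^T sigma_y^{-1} l_i for each source
        post_var = np.maximum(gamma * (1.0 - gamma * q), 0.0)  # posterior variances
        gamma = mu ** 2 + post_var             # EM update of the prior variances
    return mu, gamma
```

Sources whose learned variance `gamma` shrinks toward zero are effectively pruned, which is where the sparsity comes from. Other update rules derived from different bounds on the marginal likelihood would change only the last line of the loop.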

... Detection of spatiotemporal features of esophageal abnormality from endoscopic videos by incorporating 3D convolutional neural network and convolutional long short-term memories (LSTM) reported in [38] for the first time. Bayesian machine learning (BML) was discussed as a method to extract the electroencephalography (EEG) and magneto-encephalography (MEG) informative brain spatiotemporal-spectral patterns [39]. ...

A low-cost machine learning (ML) algorithm is proposed and discussed for spatial tracking of unknown, correlated signals in localized, ad-hoc wireless sensor networks. Each sensor is modeled as one neuron, and a selected subset of these neurons is called on to identify the spatial signal. The algorithm is implemented in two phases: spatial modeling and spatial tracking. The spatial signal is modeled using its M iso-contour lines at levels {ℓ_j}, j = 1, …, M, and those sensors whose observations lie within a Δ margin of any of these levels report their observations to the fusion center (FC) for spatial signal reconstruction. In the spatial modeling phase, the number of contour lines, their levels, and a proper Δ are identified. In this phase, the algorithm may use either an adaptive-weight stochastic gradient or a scaled stochastic gradient method to select a proper Δ. Zero-mean additive white Gaussian noise (AWGN) is assumed on the sensor observations. To reduce the effect of this observation noise, each sensor applies a moving average filter to its observations. The modeling performance, the cost, and the convergence of the algorithm are discussed based on extensive computer simulations and reasoning. The algorithm is proposed for environmental monitoring. In this paper, the cost is defined as the percentage of communication attempts made by the wireless sensors. Performance evaluation results show that the proposed spatial tracking approach is low cost and can model the spatial signal over time with the same performance as that of spatial modeling.
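The reporting rule described in this abstract can be sketched as follows: each sensor smooths its noisy readings with a moving average, and only sensors whose smoothed value lies within Δ of some contour level report to the fusion center. This is a simplified illustration under assumed names (`moving_average`, `reporting_sensors`, `communication_cost`) and an assumed causal window. It does not include the stochastic-gradient selection of Δ.

```python
import numpy as np

def moving_average(x, window):
    """Causal moving average over the last `window` samples (shorter at the start)."""
    x = np.asarray(x, dtype=float)
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out = np.empty_like(x)
    for t in range(len(x)):
        lo = max(0, t - window + 1)
        out[t] = (csum[t + 1] - csum[lo]) / (t + 1 - lo)
    return out

def reporting_sensors(observations, levels, delta):
    """Boolean mask: True where a sensor value lies within `delta` of any contour level."""
    obs = np.asarray(observations, dtype=float)[:, None]
    lv = np.asarray(levels, dtype=float)[None, :]
    return (np.abs(obs - lv) <= delta).any(axis=1)

def communication_cost(mask):
    """Percentage of sensors that initiate a communication attempt."""
    return 100.0 * np.count_nonzero(mask) / mask.size
```

A smaller Δ lowers the communication cost but leaves fewer points per contour for reconstruction, which is the trade-off the modeling phase has to settle.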

... Various methods have been used to extract features from EEG signals. Popular methods are entropy [14], detrended moving average (DMA) [15], isomap-based estimation [16], Bayesian [17], and others [18]. In the past decade, entropy algorithms have been widely used for features extraction in anaesthetic EEG signals. ...

Anaesthesia is a state of temporary controlled loss of awareness induced for medical operations. An accurate assessment of the depth of anaesthesia (DoA) helps anesthesiologists to avoid awareness during surgery and keep the recovery period short. However, the existing DoA algorithms have limitations, such as insufficient robustness across patients and time delays in assessment. In this study, to develop a reliable DoA measurement method, pre-denoised electroencephalograph (EEG) signals are divided into ten frequency bands (α, β1, β2, β3, β4, β, βγ, γ, δ and θ), and features are extracted from the different frequency bands using spectral entropy (SE) methods. SE from the beta-gamma frequency band (21.5–38.5 Hz) and SE from the beta frequency band show the highest correlation (R-squared value: 0.8458 and 0.7312, respectively) with the most popular DoA index, the bispectral index (BIS). In this research, a new DoA index is developed based on these two SE features for monitoring the DoA. On the testing data, the highest Pearson correlation coefficient with the BIS index is 0.918, and the average is 0.80. In addition, the proposed index reacts earlier than the BIS index when the patient goes from deep anaesthesia to moderate anaesthesia, which makes it more suitable for real-time DoA assessment. In cases of poor signal quality (SQ), where the BIS index exhibits inflexibility, the proposed index gives reliable assessment results that reflect the clinical observations.
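As an illustration of the band-limited spectral entropy feature mentioned above, the following sketch computes a normalized Shannon entropy of the periodogram within one frequency band. The exact SE definition and preprocessing used in the study are not given here, so the periodogram estimate, the normalization by the log of the number of bins, and the function name `band_spectral_entropy` are all assumptions.

```python
import numpy as np

def band_spectral_entropy(signal, fs, f_lo, f_hi):
    """Normalized Shannon entropy of the power spectrum within [f_lo, f_hi] Hz.

    Returns a value in [0, 1]: near 0 for a single dominant frequency,
    near 1 for a flat spectrum across the band.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                                 # remove the DC offset
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2              # raw periodogram
    p = power[(freqs >= f_lo) & (freqs <= f_hi)]
    if p.size < 2 or p.sum() == 0:
        return 0.0
    p = p / p.sum()                                  # treat the band as a distribution
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum() / np.log(p.size))
```

For example, `band_spectral_entropy(eeg, fs, 21.5, 38.5)` would give one such feature for the beta-gamma band quoted in the abstract, where `eeg` and `fs` stand for a hypothetical single-channel recording and its sampling rate.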

... The emerging intelligent applications and high-performance systems require more complexity and demand sensory units to accurately describe the physical object. The decision-making unit or algorithm can therefore output a more reliable result (Khezri and Jahed, 2007; Wu et al., 2016; He et al., 2017; Liang et al., 2018, 2019). Depending on the signal acquiring position, Figure 1 illustrates four biopotential sensors and two widely used wearable sensors along with their learning systems and applications, which have also been summarized in Table 1. ...

Wearable devices are a fast-growing technology with impact on personal healthcare for both society and economy. Due to the widespread of sensors in pervasive and distributed networks, power consumption, processing speed, and system adaptation are vital in future smart wearable devices. The visioning and forecasting of how to bring computation to the edge in smart sensors have already begun, with an aspiration to provide adaptive extreme edge computing. Here, we provide a holistic view of hardware and theoretical solutions toward smart wearable devices that can provide guidance to research in this pervasive computing era. We propose various solutions for biologically plausible models for continual learning in neuromorphic computing technologies for wearable sensors. To envision this concept, we provide a systematic outline in which prospective low power and low latency scenarios of wearable sensors in neuromorphic platforms are expected. We successively describe vital potential landscapes of neuromorphic processors exploiting complementary metal-oxide semiconductors (CMOS) and emerging memory technologies (e.g., memristive devices). Furthermore, we evaluate the requirements for edge computing within wearable devices in terms of footprint, power consumption, latency, and data size. We additionally investigate the challenges beyond neuromorphic computing hardware, algorithms and devices that could impede enhancement of adaptive edge computing in smart wearable devices.

... To date, machine learning methods have generated growing interests in medicine for its promising applications in medical diagnosis, EEG and medical imaging analysis, and mental health (Wu et al., 2016; Rajkomar et al. 2019; Chambon et al., 2016). Deep learning (or deep neural network) approaches have demonstrated extraordinary regression or classification performance as a result of the increased computing power and the mining of a large number of samples (LeCun et al., 2015; Schmidhuber, 2015). ...

Objective:
Automatic detection of interictal epileptiform discharges (IEDs, "spikes" for short) from an epileptic brain can help predict seizure recurrence and support the diagnosis of epilepsy. Developing fast, reliable and robust detection methods for IEDs based on scalp or intracortical EEG may facilitate online seizure monitoring and closed-loop neurostimulation.
Approach:
We developed a new deep learning approach, which employs a long short-term memory (LSTM) network architecture ("IEDnet") and an auxiliary classifier generative adversarial network (AC-GAN), to train on both expert-annotated and augmented spike events from intracranial electroencephalography (iEEG) recordings of epilepsy patients. We validated our IEDnet with two real-world iEEG datasets, and compared IEDnet with the support vector machine (SVM) and random forest (RF) classifiers on their detection performances.
Main results:
IEDnet achieved excellent cross-validated detection performances in terms of both sensitivity and specificity, and outperformed SVM and RF. Synthetic spike samples augmented by AC-GAN further improved the detection performance. In addition, the performance of IEDnet was robust with respect to the sampling frequency and noise. Furthermore, we also demonstrated the cross-institutional generalization ability of IEDnet while testing between two datasets.
Significance:
IEDnet achieves excellent detection performances in identifying interictal spikes. AC-GAN can produce augmented iEEG samples to improve supervised deep learning.

... Therefore, the art demand preference analysis system proposed in this paper can detect the desire of consumers in major cities to eat various artworks. With the support of big data background, the machine learning method [9][10] is used to monitor the search volume of different artworks by users, and then statistical analysis is conducted on the preferences of users in various provinces and cities for artworks. Search in a large number of search keywords to form a memory, will set good retrieval requirements, real-time monitoring of user search, establish an adaptive search model with user synchronous changes, and predict the demand of residents for art in this area. ...

In less than 30 years, the Chinese art market has completed a remarkable transformation from traditional to modern. In the process of development, great changes have taken place in the form of China's art market. The scale of art transactions is gradually expanding, and prices in the Chinese art market are rising. In order to achieve efficient retrieval against the background of big data and to understand the demand of the contemporary art market, this paper studies an art demand preference analysis system based on big data. In the big data environment, machine learning is used to detect the general public's interest in art more accurately. By extracting user preferences from the big data of users' search records, a search model that changes synchronously with user preferences is established using machine learning, so as to predict users' interest in artworks in advance. The system is applied to Baidu, Jingdong and other platforms, and a questionnaire survey is carried out. The results show that the detected data are similar to the survey results.

... Nonparametric Bayesian approaches like Gaussian process (GP) is closely related to ANN. GP has recently been used for system identification purpose [88][89][90] and applied to analyse neurophysiological signals [91], such as the use of GP modelling for EEG-based seizure detection and prediction [92] and heteroscedastic modelling of noisy high-dimensional MEG data [93]. Compared with ANN, GP can be applied to model datasets with small sample size and it has a relatively small number of hyperparameters. ...

The human nervous system is one of the most complicated systems in nature. Complex nonlinear behaviours have been shown from the single neuron level to the system level. For decades, linear connectivity analysis methods, such as correlation, coherence and Granger causality, have been extensively used to assess the neural connectivities and input-output interconnections in neural systems. Recent studies indicate that these linear methods can only capture a certain amount of neural activities and functional relationships, and therefore cannot describe neural behaviours in a precise or complete way. In this review, we highlight recent advances in nonlinear system identification of neural systems, corresponding time and frequency domain analysis, and novel neural connectivity measures based on nonlinear system identification techniques. We argue that nonlinear modelling and analysis are necessary to study neuronal processing and signal transfer in neural systems quantitatively. These approaches can hopefully provide new insights to advance our understanding of neurophysiological mechanisms underlying neural functions. These nonlinear approaches also have the potential to produce sensitive biomarkers to facilitate the development of precision diagnostic tools for evaluating neurological disorders and the effects of targeted intervention.

... However, implementing the cross-validation requires additional samples for performance validation and is generally time-consuming, which limits the practicability of BCI systems, to some extent. Bayesian inference provides an elegant way to circumvent this issue by exploiting a properly designed prior distribution [56], [57]. As a typical method, sparse Bayesian learning-based algorithms [18], [58], [59], [60] have been developed for automatic optimization of model hyperparameters, by exploiting a sparsity-induced prior, such as automatic relevance determination or Laplace distribution. ...

Accurate electroencephalogram (EEG) pattern decoding for specific mental tasks is one of the key steps for the development of brain-computer interface (BCI), which is quite challenging due to the considerably low signal-to-noise ratio of EEG collected at the brain scalp. Machine learning provides a promising technique to optimize EEG patterns toward better decoding accuracy. However, existing algorithms do not effectively explore the underlying data structure capturing the true EEG sample distribution, and hence can only yield a suboptimal decoding accuracy. To uncover the intrinsic distribution structure of EEG data, we propose a clustering-based multi-task feature learning algorithm for improved EEG pattern decoding. Specifically, we perform affinity propagation-based clustering to explore the subclasses (i.e., clusters) in each of the original classes, and then assign each subclass a unique label based on a one-versus-all encoding strategy. With the encoded label matrix, we devise a novel multi-task learning algorithm by exploiting the subclass relationship to jointly optimize the EEG pattern features from the uncovered subclasses. We then train a linear support vector machine with the optimized features for EEG pattern decoding. Extensive experimental studies are conducted on three EEG datasets to validate the effectiveness of our algorithm in comparison with other state-of-the-art approaches. The improved experimental results demonstrate the outstanding superiority of our algorithm, suggesting its prominent performance for EEG pattern decoding in BCI applications.

... With the recent rapid advancement in machine learning (ML)- and deep learning (DL)-driven data science technologies, emerging new informatics tools are effective and efficient for MEG data processing and mining. For example, Bayesian inference was used in MEG signal processing and brain activity prediction [13], and supervised learning techniques have been incorporated into downstream data mining for MEG in neuropsychiatric and neurodegenerative disorders, such as Huntington's diseases, mild traumatic injury and bipolar disorders, yielding promising results [14][15][16]. We have shown the utility of a ML-based data mining pipeline in PTSD classification using MEG connectome data [17]. ...

Objective: The present study explores the effectiveness of incorporating temporal information in predicting Post-Traumatic Stress Disorder (PTSD) severity using magnetoencephalography (MEG) imaging data. The main objective was to assess the relationship between longitudinal MEG functional connectome data, measured across a variety of neural oscillatory frequencies and collected at two timepoints (Phase I & II), against PTSD severity captured at the later time point. Approach: We used an in-house developed informatics solution, a two-step process combining pre-learn feature selection (CV-SVR-rRF-FS, cross-validation with support vector regression and recursive random forest feature selection) and deep learning (long short-term memory recurrent neural network, LSTM-RNN) techniques. Main results: The pre-learn step selected a small number of functional connections (or edges) from Phase I MEG data associated with Phase II PTSD severity, indexed using the PTSD CheckList (PCL) score. This strategy identified the functional edges affected by traumatic exposure and indexed disease severity, either permanently or evolving dynamically over time, for optimal predictive performance. Using the selected functional edges, LSTM modelling was used to incorporate the Phase II MEG data into longitudinal regression models. Single timepoint (Phase I and Phase II MEG data) SVR models were generated for comparison. Assessed with holdout test data, alpha and high gamma bands showed enhanced predictive performance with the longitudinal models compared to the Phase I single timepoint models. The best predictive performance was observed for lower frequency ranges compared to the higher frequencies (low gamma), for both model types. Significance: This study identified the neural oscillatory signatures that benefited from additional temporal information when estimating the outcome of PTSD severity using MEG functional connectome data. Crucially, this approach can similarly be applied to any other mental health challenge, using this effective informatics foundation for longitudinal tracking of pathological brain states and predicting outcome with a MEG-based neurophysiology imaging system.

... Along this line, the Lp norm iterative sparse solution (LPISS) [11] is an iterative sparse learning algorithm based on an Lp norm. Sparse Bayesian learning approaches (SBL) [12]-[14] cast the inverse problem under an empirical Bayesian framework where hyperparameters can be automatically determined and sparse solutions can be obtained. In particular, Champagne [15], [16] is an SBL approach that estimates the number, location, and time course of the sources in a principled fashion. ...

Electromagnetic source imaging (ESI) is a highly ill-posed inverse problem. To find a unique solution, traditional ESI methods impose a variety of priors that may not reflect the actual source properties. Such limitations of traditional ESI methods hinder their further applications. Inspired by deep learning approaches, a novel data-synthesized spatio-temporal denoising autoencoder (DST-DAE) method is proposed to solve the ESI inverse problem. Unlike the traditional methods, we utilize a neural network to directly seek a generalized mapping from the measured E/MEG signals to the cortical sources. A novel data synthesis strategy is employed by introducing the prior information of sources to the generated large-scale samples using the forward model of ESI. All the generated data are used to drive the neural network to automatically learn the inverse mapping. To achieve better estimation performance, a denoising autoencoder (DAE) architecture with spatio-temporal feature extraction blocks is designed. We show that (1) the novel deep learning approach provides an effective and easy-to-apply way to solve the ESI problem, (2) compared to traditional methods, DST-DAE with the data synthesis strategy can better capture the characteristics of real sources than the mathematical formulation of prior assumptions, and (3) the specifically designed DAE architecture not only provides a better estimation of source signals but is also robust to noise. Extensive numerical experiments show that the proposed method is superior to the traditional knowledge-driven ESI methods.

... The rapid advancement of ML-based data mining approaches has shown promise in neuroradiology for the assessment of large, multidimensional data sets. For example, various Bayesian inference-based ML algorithms have been developed for neuroimaging signal processing, and supervised learning methods have been used as informatics tools for data mining (Wu et al. 2016). In the context of translational research and clinical applications, these methods are actively explored for diagnosis, prognostication, and intervention efficacy. ...

Objective: Mild traumatic brain injury (mTBI) is impossible to detect using standard neuroradiological assessment such as structural magnetic resonance imaging (MRI). Injury does however disrupt the dynamic repertoire of neural activity indexed by neural oscillations. In particular, beta oscillations are reliable predictors of cognitive, perceptual and motor system functioning, as well as correlate highly with underlying myelin architecture and brain connectivity - all factors particularly susceptible to dysregulation after mTBI. Methods: We measured local and large-scale neural circuit function using MEG (magnetoencephalography) with a data-driven model fit approach using the Fitting Oscillations & One-Over F algorithm, in a group of young adult males with mTBI and a matched healthy control group. We quantified band-limited regional power and functional connectivity between brain regions. Results: We found reduced regional power and deficits in functional connectivity across brain areas, which pointed to the well-characterized thalamocortical dysconnectivity associated with mTBI. Furthermore, our results suggested beta functional connectivity data reached the best mTBI classification performance when compared with regional power and symptom severity (measured using SCAT2, or Sport Concussion Assessment Tool 2). Conclusions: The current study revealed the relevance of beta oscillations as a window into neurophysiological dysfunction in mTBI, and also highlights the reliability of neural synchrony biomarkers in disorder classification.

... The most common neuroimaging modality that is employed in BCIs is the electroencephalography, a typically non-invasive neuroimaging technology that measures the brain's electrical activity using electrodes placed on the human scalp. The produced recording, called electroencephalogram (EEG), is not easy to interpret as it has a low signal to noise ratio and its statistical properties change substantially with the course of time [1]. Moreover, EEG is known to vary significantly across individuals and even to depend on subject's state during the recording. ...

Electroencephalography signals inherently deviate from the notion of regular spatial sampling, as they reflect the coordinated action from multiple distributed overlapping cortical networks. Hence, the observed brain dynamics are influenced both by the topology of the sensor array and the underlying functional connectivity. Neural engineers are currently exploiting the advances in the domain of graph signal processing in an attempt to create robust and reliable brain decoding systems. In this direction, Geometric Deep Learning is a highly promising concept for combining the benefits of graph signal processing and deep learning towards revolutionizing Brain-Computer Interfaces (BCIs). However, its exploitation has been hindered by its data-demanding character. As a remedy, we propose here a novel data augmentation approach that combines the multiplex network modelling of multichannel signal with a graph variant of the classical Empirical Mode Decomposition (EMD), and which proves to be a strong asset when combined with Graph Convolutional Neural Networks (GCNNs). As our graph-EMD algorithm makes no assumptions with respect to linearity and stationarity, it appears as an appealing solution towards analyzing brain signals without artificially imposing regularities in either temporal or spatial domain. Our experimental results indicate that the proposed scheme for data augmentation leads to substantial improvement when it is combined with GCNNs. Using recordings from two distinct BCI applications and comparing against a state-of-the-art augmentation method, we illustrate the benefits from its use. By making it available to BCI community, we hope to further foster the application of geometric deep learning in the field.

... A small number of features are computed on multiple channels to capture the inter-channel relations, e.g., the asymmetry of PSD between two hemispheres [7] and functional connectivity [29], [30], where common indices such as correlation, coherence and phase synchronization were used to estimate brain functional connectivity between channels. Another line of research in multi-channel features is to use common spatial filters [31] and spatial-temporal filters [32], [33] to extract class-discriminative EEG features. In contrast, our model is designed to operate on single-channel features and learn to effectively combine them using a graph neural network. ...

Electroencephalography (EEG) measures the neuronal activities in different brain regions via electrodes. Many existing studies on EEG-based emotion recognition do not fully exploit the topology of EEG channels. In this paper, we propose a regularized graph neural network (RGNN) for EEG-based emotion recognition. RGNN considers the biological topology among different brain regions to capture both local and global relations among different EEG channels. Specifically, we model the inter-channel relations in EEG signals via an adjacency matrix in a graph neural network where the connection and sparseness of the adjacency matrix are inspired by neuroscience theories of human brain organization. In addition, we propose two regularizers, namely node-wise domain adversarial training (NodeDAT) and emotion-aware distribution learning (EmotionDL), to better handle cross-subject EEG variations and noisy labels, respectively. Extensive experiments on two public datasets, SEED and SEED-IV, demonstrate the superior performance of our model than state-of-the-art models in most experimental settings. Moreover, ablation studies show that the proposed adjacency matrix and two regularizers contribute consistent and significant gain to the performance of our RGNN model. Finally, investigations on the neuronal activities reveal important brain regions and inter-channel relations for EEG-based emotion recognition.

... To resolve this issue, we imposed a sparsity constraint on the estimate and employed a variational Bayes (VB) linear regression combined with an automatic relevance determination (ARD) procedure (Bishop, 2006; Wu, Nagarajan, & Chen, 2016; Drugowitsch, 2017). The VB-ARD sparse regression showed improved decoding performance in the presence of high-dimensionality (see the results in section 3.2). ...

Large-scale fluorescence calcium imaging methods have become widely adopted for studies of long-term hippocampal and cortical neuronal dynamics. Pyramidal neurons of the rodent hippocampus show spatial tuning in freely foraging or head-fixed navigation tasks. Development of efficient neural decoding methods for reconstructing the animal's position in real or virtual environments can provide a fast readout of spatial representations in closed-loop neuroscience experiments. Here, we develop an efficient strategy to extract features from fluorescence calcium imaging traces and further decode the animal's position. We validate our spike inference-free decoding methods in multiple in vivo calcium imaging recordings of the mouse hippocampus based on both supervised and unsupervised decoding analyses. We systematically investigate the decoding performance of our proposed methods with respect to the number of neurons, imaging frame rate, and signal-to-noise ratio. Our proposed supervised decoding analysis is ultrafast and robust, and thereby appealing for real-time position decoding applications based on calcium imaging.

... Rapid advancement in artificial intelligence and machine learning have shown promise in brain imaging and computational neuroscience. Various Bayesian inference-based machine learning algorithms have been developed and implemented for neuroimaging signal processing and temporal brain activity prediction [16]. In translational research and clinical applications, these methods are being actively explored for pre-symptomatic diagnosis, prognostic prediction, and medical intervention effectiveness prediction [17]. ...

Given the subjective nature of conventional diagnostic methods for post-traumatic stress disorder (PTSD), an objectively measurable biomarker is highly desirable, especially to clinicians and researchers. Macroscopic neural circuits measured using magnetoencephalography (MEG) have previously been shown to be indicative of the PTSD phenotype and severity. In the present study, we employed a machine learning-based classification framework using MEG neural synchrony to distinguish combat-related PTSD from trauma-exposed controls. Support vector machine (SVM) was used as the core classification algorithm. A recursive random forest feature selection step was directly incorporated in the nested SVM cross validation process (CV-SVM-rRF-FS) for identifying the most important features for PTSD classification. For the five frequency bands tested, the CV-SVM-rRF-FS analysis selected the minimum numbers of edges per frequency that could serve as a PTSD signature and be used as the basis for SVM modelling. Many of the selected edges have been reported previously to be core in PTSD pathophysiology, with frequency-specific patterns also observed. Furthermore, the independent partial least squares discriminant analysis suggested low bias in the machine learning process. The final SVM models built with selected features showed excellent PTSD classification performance (area-under-curve value up to 0.9). Testament to its robustness when distinguishing individuals from a heavily traumatised control group, these developments for a classification model for PTSD also provide a comprehensive machine learning-based computational framework for classifying other mental health challenges using MEG connectome profiles.

... However, deep neural network-based EEG classification techniques are also trending, and they require high-capacity computer systems for signal analysis. Wu W et al. [49], in their research, proposed Bayesian machine learning-based EEG/MEG data analysis. This is a deep learning-based model that can be implemented to merge the two-stage process into one for faster motor imagery classification. ...

Identification of imagined motor movement during cerebral activity is a significant task for translating the activity into control signals for brain–computer interface-based applications. This paper proposes a model based on non-dyadic wavelet decomposition for specifying left-hand and right-hand movement detection using motor imagery EEG signals. Three key components define our model: (1) The preprocessing and non-dyadic wavelet decomposition of EEG signals, (2) feature extraction using the common spatial pattern (CSP) coefficients of wavelet decomposed signals, (3) classification of test signal using selected features. The wavelet decomposition is done on the basis of m-band filtering. Classification of extracted features is done using different classifiers and obtained results are compared in terms of sensitivity, specificity, accuracy, and the kappa value. The proposed model gives the highest classification accuracy of 85.6% using decision tree classifier for BCI Competition 4 dataset IIa and BCI Competition 3 dataset IVa.

... The first is a smearing of signal and noise resulting from volume conduction due to each electrode picking up neural signals from multiple sources, and adjacent electrodes detecting neural signals from the same sources [14]. Second, there is a risk of overfitting the model given the high spatiotemporal dimensionality and noisiness of EEG data [15,16]. Third, there are challenges in simultaneously optimizing feature identification and fitting of predictive regression models due to the nonlinearity of the error function with respect to the model parameters [17]. ...

Antidepressants are widely prescribed, but their efficacy relative to placebo is modest, in part because the clinical diagnosis of major depression encompasses biologically heterogeneous conditions. Here, we sought to identify a neurobiological signature of response to antidepressant treatment as compared to placebo. We designed a latent-space machine learning algorithm tailored for resting-state electroencephalography (rsEEG) and applied it to data from the largest imaging-coupled, placebo-controlled antidepressant study (n=309). Symptom improvement was robustly predicted in a manner both specific for the antidepressant sertraline (versus placebo) and generalizable across different study sites and EEG equipment. This sertraline-predictive EEG signature generalized to two depression samples, wherein it reflected general antidepressant medication responsivity, and related differentially to repetitive transcranial magnetic stimulation (rTMS) treatment outcome. Furthermore, we found that the sertraline rsEEG signature indexed prefrontal neural responsivity, as measured by concurrent TMS/EEG. Our findings advance the neurobiological understanding of antidepressant treatment through an EEG-tailored computational model and provide a clinical avenue for personalized treatment of depression.

... Support vector machine (SVM) is being used widely in P300 based BCI systems [31]. Other classifiers such as Bayesian machine learning [32], non-linear Bayesian classifiers, and k nearest neighbor (kNN) are also used in P300 detection [33]. But non-linear methods are less preferred than linear classifiers in P300 based BCI systems. ...

P300 signal is an endogenous event related potential component. It is mostly elicited from the frontal to parietal brain lobes. Electroencephalography is used for acquiring the P300 signal from the scalp. The P300 signal is used for brain-computer interface systems. P300 based brain-computer interface systems are preferable since they have high overall performance. The most significant overall performance indicator for P300 based brain-computer interface systems is information transfer rate. P300 signal detection accuracy and P300 detection time are used for information transfer rate calculation. Hence, P300 signal classification accuracy is important for getting a higher information transfer rate. This study aims to investigate a P300 detection model with higher classification accuracy. Thus, a 3-dimensional input convolutional neural network model is proposed for P300 detection. Moreover, the proposed model was applied with a region based P300 speller which comprised audio and visual stimuli. In experiments, the participants were asked to spell desired words in two sessions, an offline and an online session. Linear support vector machine, stepwise linear discriminant analysis, 2-dimensional input convolutional neural network, and the proposed method were compared in both online and offline sessions. The proposed method reached the highest average classification accuracy rate in both sessions. According to the online session result, average classification accuracy was 94.22% in the 3-dimensional input convolutional neural network model. Furthermore, average information transfer rate was 5.53 bit/min in the 3-dimensional input convolutional neural network model. We also applied the methods to BCI competition III-dataset II for 2 participants “A” and “B” to evaluate the performance of the algorithms. The proposed method had a higher classification accuracy rate than linear support vector machine, stepwise linear discriminant analysis, 2-dimensional input convolutional neural network, and multi-classifier convolutional neural network, which was used in another study on the same dataset.
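The P300 abstract above notes that detection accuracy and detection time both enter the information transfer rate, but it does not state the formula. The following is a minimal sketch, assuming the widely used Wolpaw definition (an assumption; the study may use a different variant). The function names, `n_classes` (number of selectable targets) and `seconds_per_selection` are illustrative, not taken from the paper.

```python
import math


def wolpaw_bits_per_selection(n_classes: int, accuracy: float) -> float:
    """Bits conveyed by one selection under the Wolpaw ITR definition.

    B = log2(N) + P*log2(P) + (1 - P)*log2((1 - P) / (N - 1))
    """
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    bits = math.log2(n_classes)
    # The P*log2(P) terms tend to 0 as P -> 0, so skip them at the limits.
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits


def itr_bits_per_minute(n_classes: int, accuracy: float,
                        seconds_per_selection: float) -> float:
    """Scale bits per selection by the number of selections made per minute."""
    if seconds_per_selection <= 0:
        raise ValueError("seconds_per_selection must be positive")
    return wolpaw_bits_per_selection(n_classes, accuracy) * 60.0 / seconds_per_selection
```

Under this definition, accuracy at chance level (1/N) carries zero bits, and halving the time per selection doubles the rate, which is why both accuracy and detection time matter for the reported figure.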

The introduction of the Internet of Things has led to the connectivity of millions of devices with less human interaction. This demand in connectivity has resulted in a surge in network attacks as IoT is susceptible to several cyberattacks. Due to their resource-constrained nature, traditional security mechanisms are inappropriate for securing IoT systems. Hence, there is a need for pervasive security mechanisms that are robust to mitigate attacks and secure IoT networks. One of the emerging potential solutions to network security is Machine Learning (ML). Recently, ML has been applied to mitigate cybersecurity threats in Cyber-Physical Systems (CPS). This paper presents a hybrid ML model for the efficient and effective detection of anomalies in IoT systems. The proposed model combines Random Forest algorithm, XGB, KNN and two decision trees with equal weights assigned to enhance the detection of anomalies in IoT systems. Experimental results show that the proposed hybrid model achieves a higher misbehaviour detection rate when compared to the other ML models in terms of accuracy, precision, recall and f1-score.

Accurate reconstruction of brain activities from electroencephalography and magnetoencephalography (E/MEG) remains a long-standing challenge because of the intrinsic ill-posedness of the inverse problem. In this study, to address this issue, we propose a novel data-driven source imaging framework based on sparse Bayesian learning and deep neural network (SI-SBLNN). Within this framework, the variational inference in the conventional algorithm, which is built upon sparse Bayesian learning, is compressed by constructing a straightforward mapping from measurements to latent sparseness encoding parameters using a deep neural network. The network is trained with synthesized data derived from the probabilistic graphical model embedded in the conventional algorithm. We realized this framework with the algorithm source imaging based on spatio-temporal basis function (SI-STBF) as backbone. In numerical simulations, the proposed algorithm demonstrated its applicability to different head models and its robustness against different noise intensities. Meanwhile, it achieved superior performance compared to SI-STBF and several benchmarks in a variety of source configurations. Additionally, in real data experiments, it obtained results concordant with prior studies.

With the emergence of the COVID-19 pandemic, e-learning was the only way to solve the problem of study interruption in educational institutions and universities. Therefore, this field has received significant attention in recent times. In this paper, we used ten Machine Learning (ML) algorithms (Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), SGD Classifier, Multinomial NB, K-Nearest Neighbors Classifier (KNN), Ridge Classifier, Nearest Centroid, Complement NB and Bernoulli NB) to build a prediction system based on artificial intelligence techniques to predict the difficulties students face in using the e-learning management system, to support related decision-making. This, in turn, contributes to supporting the sustainable development of technology at the university. From the results obtained, we detect the important factors that affect the use of E-learning to solve students’ learning difficulties using LMS by building a prediction system based on AI techniques. Keywords: Machine learning; E-learning; Student’s performance; Educational data mining

Simultaneously estimating brain source activity and noise has long been a challenging task in electromagnetic brain imaging using magneto- and electroencephalography. The problem is challenging not only in terms of solving the NP-hard inverse problem of reconstructing unknown brain activity across thousands of voxels from a limited number of sensors, but also for the need to simultaneously estimate the noise and interference. We present a generative model with an augmented leadfield matrix to simultaneously estimate brain source activity and sensor noise statistics in electromagnetic brain imaging (EBI). We then derive three Bayesian inference algorithms for this generative model (expectation-maximization (EBI-EM), convex bounding (EBI-Convex) and fixed-point (EBI-Mackay)) to simultaneously estimate the hyperparameters of the prior distribution for brain source activity and sensor noise. A comprehensive performance evaluation for these three algorithms is performed. Simulations consistently show that the performance of EBI-Convex and EBI-Mackay updates is superior to that of EBI-EM. In contrast to the EBI-EM algorithm, both EBI-Convex and EBI-Mackay updates are quite robust to initialization, and are computationally efficient with fast convergence in the presence of both Gaussian and real brain noise. We also demonstrate that EBI-Convex and EBI-Mackay update algorithms can reconstruct complex brain activity with only a few trials of sensor data, and for resting-state data, achieving significant improvement in source reconstruction and noise learning for electromagnetic brain imaging.

Communication between the human brain and external devices can be established using an Electroencephalogram (EEG)-based Brain–Computer Interface by converting the neural activities of the brain into electric signals. The EEG signals were decomposed into an energy–frequency–time spectrum with the Hilbert Huang transform, which was used by the Deep Learning (DL)-based model to learn discriminative spectro-temporal patterns of the raw EEG signals of ten digits. This paper has two major contributions: first, we create a novel dataset known as BrainDigiData of EEG signals of ten digits (0–9) using a multi-channel EEG device. Second, we propose a DL-based one-dimensional Convolutional neural network model, BrainDigiCNN, to classify the BrainDigiData of EEG signals of digits. The publicly available Mind Big Dataset (MBD) of digits was also used to evaluate the performance of the proposed model. The research done in this paper showed that the band-wise analysis of EEG signals in a complex scenario resulted in improved results as compared to the scenario used in the previously existing work for digit classification using EEG signals. The proposed BrainDigiCNN model achieved the highest average accuracy of 96.99%. The average classification accuracy of 98.27% was achieved for the MBD dataset of the 14-channel device EMOTIV EPOC+ and 89.62% on the MBD dataset of the 5-channel EMOTIV Insight. The statistical comparison of the proposed model with traditional Machine Learning (ML) classifiers using a paired t-test resulted in a p-value less than 0.05, which shows a significant difference between the proposed model and ML classifiers.

Dementia is a group of symptoms caused by neurodegenerative disease. It is characterized by impairment in memory, reasoning, behavior, and the capability to perform everyday activities. Worldwide, 50 million people have dementia, and nearly 10 million new dementia cases occur each year. Dementia is a significant reason for disability and dependency in late life. Dementia has a physical, psychological, social, and economic influence on dementia patients and their carers, families, and society. Therefore, there is a need for automated early dementia diagnosis that has cognitive as well as electroencephalogram (EEG) components. State-of-the-art methods have been proposed for efficient dementia diagnosis using machine learning (ML) and deep learning (DL) algorithms with imaging data. Usually, imaging diagnosis misses the early signs of neurodegenerative disease; however, these signs are clearly visible in a psychophysiological experiment. Datasets for dementia diagnosis using cognitive tasks are limited, but some recent research has shown significant results using different cognitive tests. Many other EEG-based ML techniques have achieved good accuracy in early dementia diagnosis, but there is still no final solution. This chapter summarizes all the work done to date for dementia diagnosis based on EEG and cognitive task data and compares various ML approaches used in this regard. It also summarizes different ML approaches with advanced EEG signal processing that can guide future researchers, practitioners, and technicians.

Accurate reconstruction of cortical activation from electroencephalography and magnetoencephalography (E/MEG) is a long-standing challenge because of the inherently ill-posed inverse problem. In this paper, a novel algorithm under the empirical Bayesian framework, source imaging with smoothness in spatial and temporal domains (SI-SST), is proposed to address this issue. In SI-SST, current sources are decomposed into the product of a spatial smoothing kernel, sparseness encoding coefficients, and temporal basis functions (TBFs). Further smoothness is integrated in the temporal domain by employing an underlying autoregressive model. Because sparseness encoding coefficients are constructed depending on overlapped clusters over the cortex in this model, we derived a novel update rule based on a fixed-point criterion instead of the convexity-based approach, which becomes invalid in this scenario. All variables and hyperparameters are updated alternately in the variational inference procedure. SI-SST was assessed by multiple metrics with both simulated and experimental datasets. In practice, SI-SST had superior reconstruction performance in both spatial extents and temporal profiles compared to the benchmarks.

The Electroencephalogram (EEG) signal, as a data carrier that can contain a large amount of information about the human brain in different states, is one of the most widely used metrics for assessing human psychophysiological states. Among a variety of analysis methods, deep learning, especially convolutional neural network (CNN), has achieved remarkable results in recent years as a method to effectively extract features from EEG signals. Although deep learning has the advantages of automatic feature extraction and effective classification, it also faces difficulties in network structure design and requires a large amount of prior knowledge. Automating the design of these hyperparameters can therefore save experts' time and manpower. Neural architecture search (NAS) techniques have thus emerged. In this paper, we build on an existing gradient-based NAS algorithm, PC-DARTS, with targeted improvements and optimizations for the characteristics of EEG signals. Specifically, we establish the model architecture step by step based on the manually designed deep learning models for EEG discrimination by retaining the framework of the search algorithm and performing targeted optimization of the model search space. Corresponding features are extracted separately according to the frequency domain and time domain characteristics of the EEG signal and the spatial position of the EEG electrodes. The architecture was applied to EEG-based emotion recognition and driver drowsiness assessment tasks. The results illustrate that, compared with the existing methods, the model architecture obtained in this paper can achieve competitive overall accuracy and better standard deviation in both tasks. Therefore, this approach is an effective migration of NAS technology into the field of EEG analysis and has great potential to provide high-performance results for other types of classification and prediction tasks. This can effectively reduce the time cost for researchers and facilitate the application of CNN in more areas.

Pain is a dynamic, complex and multidimensional experience. The identification of pain from brain activity as neural readout may effectively provide a neural code for pain, and further provide useful information for pain diagnosis and treatment. Advances in neuroimaging and large-scale electrophysiology have enabled us to examine neural activity with improved spatial and temporal resolution, providing opportunities to decode pain in humans and freely behaving animals. This topical review provides a systematic overview of state-of-the-art methods for decoding pain from brain signals, with special emphasis on electrophysiological and neuroimaging modalities. We show how pain decoding analyses can help pain diagnosis and discovery of neurobiomarkers for chronic pain. Finally, we discuss the challenges in the research field and point to several important future research directions.

A rapidly aging population worldwide has spurred interest in developing new strategies to cope with neural declines and neurodegenerative disorders. Noninvasive brain stimulation (NIBS) is increasingly being used to explore functional mechanisms of the brain and induce the therapeutic modulation of behavior, cognition, and emotion. Galvanic vestibular stimulation (GVS), a safe and well-tolerated NIBS technique, is capable of modulating activity in various cortical and subcortical areas involved in vestibular and multisensory processing. A key facet of GVS is that the resultant effects may, in part, be a function of the individual being treated and the stimulus waveform that is delivered. Yet, most GVS studies have utilized the same generic stimulus, chosen from a reduced repertoire of candidates, across all subjects. The future use and, ultimately, clinical adoption of this technology will rely on contributions from the signal processing community to customize stimuli that are optimized for their effect and to exert maximum influence on brain imaging biomarkers. We provide a signal processing-focused overview of the current GVS state of the art in neurorehabilitation, including general stimulation design, concurrent analysis with neuroimaging data, and suggestions for future directions.

Multimodal functional neuroimaging by integrating functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) has the promise of recovering brain activities with high spatiotemporal resolution, which is crucial for neuroscience research and clinical diagnosis. However, the misalignment of the localizations between fMRI and EEG activities may degrade the accuracy of the fMRI-constrained EEG source imaging (ESI) technique. To leverage the complementary spatiotemporal resolution of fMRI and EEG in a data-driven fashion, we propose an asymmetric approach for EEG/fMRI fusion, termed fMRI-informed source imaging based on spatiotemporal basis functions (fMRI-SI-STBF). fMRI-SI-STBF employs the covariance components (CCs) derived from clusters defined by fMRI and EEG signals as spatial priors within the empirical Bayesian framework. Additionally, fMRI-SI-STBF represents the current source matrix as a linear combination of several unknown temporal basis functions (TBFs) by matrix decomposition. The relative contribution of each of the fMRI-informed and EEG-informed CCs, as well as the number and profiles of the TBFs, are all automatically determined based on the EEG data using variational Bayesian inference. Our results demonstrate that fMRI-SI-STBF can effectively utilize valid fMRI information for ESI and is robust to invalid fMRI priors. This robustness is essential for practical ESI since the validity of fMRI priors is often unclear considering that fMRI is an indirect measure of the neural activity. Moreover, fMRI-SI-STBF can achieve performance improvement by incorporating temporal constraints compared to methods that use spatial constraints only. Under all the simulated conditions, fMRI-SI-STBF reconstructs the source extents, locations and time courses more accurately than the EEG-fMRI ESI methods (i.e., fwMNE, fMRI-SI-SBF) and ESI methods without fMRI priors (i.e., wMNE, LORETA, SBL, SI-STBF, SI-SBF), as indicated by the smaller spatial dispersion (average SD < 5 mm), distance of localization error (average DLE < 2 mm), shape error (average SE < 0.9) and larger model evidence values.

Brain signals can be acquired and analyzed using a variety of methods, as described in this literature review. Understanding the possibilities of analytical methods broadens researchers' perspectives for developing technological approaches to detecting biological events. In particular, EEG signals can be analyzed using a variety of methods, suggesting that a combination of methods could be ideal for ease of automated analysis and diagnosis of epileptic seizures.

Accurate electroencephalogram (EEG) pattern decoding for specific mental tasks is one of the key steps for the development of brain-computer interface (BCI), which is quite challenging due to the considerably low signal-to-noise ratio of EEG collected at the brain scalp. Machine learning provides a promising technique to optimize EEG patterns toward better decoding accuracy. However, existing algorithms do not effectively explore the underlying data structure capturing the true EEG sample distribution and, hence, can only yield a suboptimal decoding accuracy. To uncover the intrinsic distribution structure of EEG data, we propose a clustering-based multitask feature learning algorithm for improved EEG pattern decoding. Specifically, we perform affinity propagation-based clustering to explore the subclasses (i.e., clusters) in each of the original classes and then assign each subclass a unique label based on a one-versus-all encoding strategy. With the encoded label matrix, we devise a novel multitask learning algorithm by exploiting the subclass relationship to jointly optimize the EEG pattern features from the uncovered subclasses. We then train a linear support vector machine with the optimized features for EEG pattern decoding. Extensive experimental studies are conducted on three EEG data sets to validate the effectiveness of our algorithm in comparison with other state-of-the-art approaches. The improved experimental results demonstrate the outstanding superiority of our algorithm, suggesting its prominent performance for EEG pattern decoding in BCI applications.

Effectively extracting common spatial pattern (CSP) features from motor imagery (MI) EEG signals is often highly dependent on the filter band selection. At the same time, optimizing the EEG channel combinations is another key issue that substantially affects the sensorimotor rhythm (SMR) feature representations. Although numerous algorithms have been developed to find channels that record important characteristics of MI, most of them select channels in a cumbersome way with low computational efficiency, thereby limiting the practicality of MI-based BCI systems. In this study, we propose the multi-scale optimization (MSO) of spatial patterns, optimizing filter bands over multiple channel sets within CSPs to further improve the performance of MI-based BCI. Specifically, several channel subsets are first heuristically predefined, and then raw EEG data specific to each of these subsets are bandpass-filtered at the overlap between a set of filter bands. Further, instead of solving learning problems for each channel subset independently, we propose a multi-view learning based sparse optimization to jointly extract robust CSP features with L2,1-norm regularization, aiming to capture the shared salient information across multiple related spatial patterns for enhanced classification performance. A support vector machine (SVM) classifier is then trained on these optimized EEG features for accurate recognition of MI tasks. Experimental results on three public EEG datasets validate the effectiveness of MSO compared to several other competing methods and their variants. These superior experimental results demonstrate that the proposed MSO method has promising potential in MI-based BCIs.
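The L2,1-norm named in the abstract above is commonly defined as the sum of the Euclidean norms of the rows of a weight matrix, so penalizing it tends to switch off whole rows (for example, one feature across all views) at once. The snippet below is a minimal sketch of that quantity only, assuming the rows-as-groups convention; how the paper groups its CSP features is not stated here, and the function name is illustrative.

```python
import math
from typing import Sequence


def l21_norm(matrix: Sequence[Sequence[float]]) -> float:
    """Sum over rows of each row's Euclidean (L2) norm."""
    return sum(math.sqrt(sum(value * value for value in row)) for row in matrix)


# Three features (rows) shared across two views (columns).
weights = [
    [3.0, 4.0],   # row norm 5
    [0.0, 0.0],   # row norm 0: this feature is dropped in every view
    [5.0, 12.0],  # row norm 13
]
total = l21_norm(weights)  # 5 + 0 + 13
```

A row that is entirely zero contributes nothing to the penalty, which is what makes the norm useful for selecting features jointly rather than per view.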

In this paper, we develop a robust sliding-mode nonlinear predictive controller for brain-controlled robots with enhanced performance, safety, and robustness. First, the kinematics and dynamics of a mobile robot are built. After that, the proposed controller is developed by cascading a predictive controller and a smooth sliding mode controller. The predictive controller integrates the human intention tracking with safety guarantee objectives into an optimization problem to minimize the invasion to human intention while maintaining the robot safety. The smooth sliding mode controller is designed to achieve robust desired velocity tracking. The results of human-in-the-loop simulation and robotic experiments both show the efficacy and robust performance of the proposed controller. This work provides an enabling design to enhance the future research and development of brain-controlled robots.

Breast cancer is a type of tumor that occurs in the tissues of the breast. It is the most common type of cancer found in women around the world and it is among the leading causes of death in women. This paper presents a comparative analysis of machine learning, deep learning and data mining techniques being used for the prediction of breast cancer. Many researchers have put their efforts into breast cancer diagnosis and prognosis; every technique has a different accuracy rate, which varies with the situations, tools and datasets being used. Our main focus is to comparatively analyze different existing Machine Learning and Data Mining techniques in order to find the most appropriate method that will support a large dataset with good prediction accuracy. The main purpose of this review is to highlight the previous studies of machine learning algorithms that are being used for breast cancer prediction, and this paper provides all the necessary information to beginners who want to analyze machine learning algorithms to gain a foundation in deep learning.

The indirect current measurement method has played a pivotal role in power grids, owing to the development of inverse problem theory. In this paper, a general current source reconstruction method based on the current element model was theoretically proposed. The method discretized the unknown current source region and aimed at reconstructing the actual current distribution according to the synthetical characteristics of discretized current elements. An approximated linear calculation model was derived to construct the inverse relations between the current element and magnetic field distribution. To solve the ill-posed inverse problem, elastic net regularization methods under non-Bayesian and Bayesian regimes were presented. The theoretical analysis was implemented under different noise levels. Four current distributions concerning the line current and volume current in practical applications were reconstructed by the two elastic net regularization methods. The results indicated that the methods were feasible in solving this ill-posed inverse problem, obtaining the current parameters and magnetic field distribution effectively. This novel regularization method can be utilized in the measurement of irregular current forms in electrical equipment, leading to an effective and robust approach in operation state acquisition and fault diagnosis of electrical equipment in power grids.
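For orientation, the non-Bayesian elastic net mentioned in the abstract above is usually written as a penalized least-squares problem. The form below is a generic sketch, not the paper's exact objective; the symbols are assumptions introduced here: b for the measured magnetic field vector, A for the linearized model matrix, x for the current element parameters, and λ1, λ2 ≥ 0 for the regularization weights.

```latex
\hat{\mathbf{x}}
  = \arg\min_{\mathbf{x}} \;
    \tfrac{1}{2}\,\lVert \mathbf{b} - \mathbf{A}\mathbf{x} \rVert_2^2
    + \lambda_1 \lVert \mathbf{x} \rVert_1
    + \tfrac{\lambda_2}{2}\,\lVert \mathbf{x} \rVert_2^2
```

Setting λ2 = 0 recovers the lasso and setting λ1 = 0 recovers ridge (Tikhonov) regularization, so the elastic net trades sparsity against stability in an ill-posed problem.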

In this work, we propose a hierarchical latent dictionary approach to estimate the time-varying mean and covariance of a process for which we have only limited noisy samples. We fully leverage the limited sample size and redundancy in sensor measurements by transferring knowledge through a hierarchy of lower dimensional latent processes. As a case study, we utilize Magnetoencephalography (MEG) recordings of brain activity to identify the word being viewed by a human subject. Specifically, we identify the word category for a single noisy MEG recording, when only given limited noisy samples on which to train.

Common spatial patterns (CSP) is a well-known spatial filtering algorithm for multichannel electroencephalogram (EEG) analysis. In this paper, we cast the CSP algorithm in a probabilistic modeling setting. Specifically, probabilistic CSP (P-CSP) is proposed as a generic EEG spatio-temporal modeling framework that subsumes the CSP and regularized CSP algorithms. The proposed framework enables us to resolve the overfitting issue of CSP in a principled manner. We derive statistical inference algorithms that can alleviate the issue of local optima. In particular, an efficient algorithm based on eigendecomposition is developed for maximum a posteriori (MAP) estimation in the case of isotropic noise. For more general cases, a variational algorithm is developed for group-wise sparse Bayesian learning for the P-CSP model and for automatically determining the model size. The two proposed algorithms are validated on a simulated data set. Their practical efficacy is also demonstrated by successful applications to single-trial classifications of three motor imagery EEG data sets and by the spatio-temporal pattern analysis of one EEG data set recorded in a Stroop color naming task.

The MEG/EEG inverse problem is ill-posed, giving different source reconstructions depending on the initial assumption sets. Parametric Empirical Bayes allows one to implement most popular MEG/EEG inversion schemes (minimum norm, LORETA, etc.) within the same generic Bayesian framework. It also provides a cost-function in terms of the variational Free energy (an approximation to the marginal likelihood or evidence of the solution). In this manuscript, we revisit the algorithm for MEG/EEG source reconstruction with a view to providing a didactic and practical guide. The aim is to promote and help standardize the development and consolidation of other schemes within the same framework. We describe the implementation in the Statistical Parametric Mapping (SPM) software package, carefully explaining each of its stages with help of a simple simulated data example. We focus on the Multiple Sparse Priors (MSP) model, which we compare with the well-known Minimum Norm and LORETA models, using the negative variational free energy for model comparison. The manuscript is accompanied by Matlab scripts to allow the reader to test and explore the underlying algorithm.
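The relationship between the free energy and the model evidence referred to in the abstract above can be made explicit with the standard variational decomposition. This is a generic sketch with notation assumed here (Y for the data, θ for the unknown sources and hyperparameters, m for the model, q for the approximate posterior); sign conventions vary between texts, and F below is the quantity that is maximized.

```latex
\ln p(\mathbf{Y} \mid m)
  = F(q) + \mathrm{KL}\!\left[\, q(\theta) \,\Vert\, p(\theta \mid \mathbf{Y}, m) \,\right],
\qquad
F(q) = \mathbb{E}_{q}\!\left[ \ln p(\mathbf{Y}, \theta \mid m) \right]
     - \mathbb{E}_{q}\!\left[ \ln q(\theta) \right]
```

Because the Kullback-Leibler term is non-negative, F(q) is a lower bound on the log evidence, which is why models with different priors can be ranked by comparing their optimized F values.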

Modelling is fundamental to many fields of science and engineering. A model can be thought of as a representation of possible data one could predict from a system. The probabilistic approach to modelling uses probability theory to express all aspects of uncertainty in the model. The probabilistic approach is synonymous with Bayesian modelling, which simply uses the rules of probability theory in order to make predictions, compare alternative models, and learn model parameters and structure from data. This simple and elegant framework is most powerful when coupled with flexible probabilistic models. Flexibility is achieved through the use of Bayesian non-parametrics. This article provides an overview of probabilistic modelling and an accessible survey of some of the main tools in Bayesian non-parametrics. The survey covers the use of Bayesian non-parametrics for modelling unknown functions, density estimation, clustering, time-series modelling, and representing sparsity, hierarchies, and covariance structure. More specifically, it gives brief non-technical overviews of Gaussian processes, Dirichlet processes, infinite hidden Markov models, Indian buffet processes, Kingman's coalescent, Dirichlet diffusion trees and Wishart processes.

Telemonitoring of electroencephalogram (EEG) through wireless body-area
networks is an evolving direction in personalized medicine. Among various
constraints in designing such a system, three important constraints are energy
consumption, data compression, and device cost. Conventional data compression
methodologies, although effective in data compression, consume significant
energy and cannot reduce device cost. Compressed sensing (CS), as an emerging
data compression methodology, is promising in catering to these constraints.
However, EEG is non-sparse in the time domain and also non-sparse in
transformed domains (such as the wavelet domain). Therefore, it is extremely
difficult for current CS algorithms to recover EEG with the quality that
satisfies the requirements of clinical diagnosis and engineering applications.
Recently, Block Sparse Bayesian Learning (BSBL) was proposed as a new method for
the CS problem. This study introduces the technique to the telemonitoring of
EEG. Experimental results show that its recovery quality is better than
state-of-the-art CS algorithms, and sufficient for practical use. These results
suggest that BSBL is very promising for telemonitoring of EEG and other
non-sparse physiological signals.

A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. When used in
conjunction with statistical techniques, the graphical model has several advantages for data analysis. One, because the model
encodes dependencies among all variables, it readily handles situations where some data entries are missing. Two, a Bayesian
network can be used to learn causal relationships, and hence can be used to gain understanding about a problem domain and
to predict the consequences of intervention. Three, because the model has both a causal and probabilistic semantics, it is
an ideal representation for combining prior knowledge (which often comes in causal form) and data. Four, Bayesian statistical
methods in conjunction with Bayesian networks offer an efficient and principled approach for avoiding the overfitting of data.
In this paper, we discuss methods for constructing Bayesian networks from prior knowledge and summarize Bayesian statistical
methods for using data to improve these models. With regard to the latter task, we describe methods for learning both the
parameters and structure of a Bayesian network, including techniques for learning with incomplete data. In addition, we relate
Bayesian-network methods for learning to techniques for supervised and unsupervised learning. We illustrate the graphical-modeling
approach using a real-world case study.

Automatic relevance determination (ARD) and the closely-related sparse Bayesian learning (SBL) framework are effective tools for pruning large numbers of irrelevant features leading to a sparse explanatory subset. However, popular update rules used for ARD are either difficult to extend to more general problems of interest or are characterized by non-ideal convergence properties. Moreover, it remains unclear exactly how ARD relates to more traditional MAP estimation-based methods for learning sparse representations (e.g., the Lasso). This paper furnishes an alternative means of expressing the ARD cost function using auxiliary functions that naturally addresses both of these issues. First, the proposed reformulation of ARD can naturally be optimized by solving a series of re-weighted ℓ1 problems. The result is an efficient, extensible algorithm that can be implemented using standard convex programming toolboxes and is guaranteed to converge to a local minimum (or saddle point). Secondly, the analysis reveals that ARD is exactly equivalent to performing standard MAP estimation in weight space using a particular feature- and noise-dependent, non-factorial weight prior. We then demonstrate that this implicit prior maintains several desirable advantages over conventional priors with respect to feature selection. Overall these results suggest alternative cost functions and update procedures for selecting features and promoting sparse solutions in a variety of general situations. In particular, the methodology readily extends to handle problems such as non-negative sparse coding and covariance component estimation.
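For orientation, the sketch below shows the widely used EM-style update for ARD/SBL, that is, one of the "popular update rules" the abstract contrasts itself with, not the re-weighted ℓ1 reformulation it proposes. The function name `sbl_em`, the fixed noise variance, the iteration count, and the small floor on the hyperparameters are assumptions for illustration.

```python
import numpy as np

def sbl_em(phi, y, noise_var=1e-2, n_iter=200):
    """EM-style ARD/SBL sketch for y = phi @ w + noise, with w_i ~ N(0, gamma_i)."""
    n_obs, n_features = phi.shape
    gamma = np.ones(n_features)  # per-feature prior variances (hyperparameters)
    mu = np.zeros(n_features)
    for _ in range(n_iter):
        pg = phi * gamma  # phi @ diag(gamma)
        sigma_y = noise_var * np.eye(n_obs) + pg @ phi.T  # marginal covariance of y
        k = np.linalg.solve(sigma_y, pg).T  # diag(gamma) phi^T sigma_y^{-1}
        mu = k @ y  # posterior mean of the weights
        # Diagonal of the posterior covariance: gamma - diag(k @ pg).
        sigma_diag = gamma - np.sum(k * pg.T, axis=1)
        # EM update; the floor guards against round-off (an implementation choice).
        gamma = np.maximum(mu ** 2 + sigma_diag, 1e-12)
    return mu, gamma
```

Features whose hyperparameter `gamma` shrinks toward zero are effectively pruned, which is the sparsity mechanism the abstract refers to.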

Many practical methods for finding maximally sparse coefficient expansions involve solving a regression problem using a particular class of concave penalty functions. From a Bayesian perspective, this process is equivalent to maximum a posteriori (MAP) estimation using a sparsity-inducing prior distribution (Type I estimation). Using variational techniques, this distribution can always be conveniently expressed as a maximization over scaled Gaussian distributions modulated by a set of latent variables. Alternative Bayesian algorithms, which operate in latent variable space leveraging this variational representation, lead to sparse estimators reflecting posterior information beyond the mode (Type II estimation). Currently, it is unclear how the underlying cost functions of Type I and Type II relate, nor what relevant theoretical properties exist, especially with regard to Type II. Herein a common set of auxiliary functions is used to conveniently express both Type I and Type II cost functions in either coefficient or latent variable space facilitating direct comparisons. In coefficient space, the analysis reveals that Type II is exactly equivalent to performing standard MAP estimation using a particular class of dictionary- and noise-dependent, non-factorial coefficient priors. One prior (at least) from this class maintains several desirable advantages over all possible Type I methods and utilizes a novel, non-convex approximation to the ℓ0 norm with most, and in certain quantifiable conditions all, local minima smoothed away. Importantly, the global minimum is always left unaltered unlike standard ℓ1-norm relaxations. This ensures that any appropriate descent method is guaranteed to locate the maximally sparse solution.

The aim of this paper is to describe a simple procedure for
electromagnetic (EEG or MEG) source reconstruction, in the context of group
studies. This entails a simple extension of existing source reconstruction
techniques based upon the inversion of hierarchical models. The extension
ensures that evoked or induced responses are reconstructed in the same subset of
sources, over subjects. Effectively, the procedure aligns the deployment of
reconstructed activity over subjects and increases, substantially, the detection
of differences between evoked or induced responses at the group or
between-subject level.

A Brain-Computer Interface (BCI) is a specific type of human-computer interface that enables direct communication between humans and computers by analyzing brain measurements. Oddball paradigms are used in BCI to generate event-related potentials (ERPs), like the P300 wave, on targets selected by the user. A P300 speller is based on this principle, where the detection of P300 waves allows the user to write characters. The P300 speller is composed of two classification problems. The first classification is to detect the presence of a P300 in the electroencephalogram (EEG). The second one corresponds to the combination of different P300 responses for determining the right character to spell. A new method for the detection of P300 waves is presented. This model is based on a convolutional neural network (CNN). The topology of the network is adapted to the detection of P300 waves in the time domain. Seven classifiers based on the CNN are proposed: four single classifiers with different feature sets and three multiclassifiers. These models are tested and compared on Data set II of the third BCI competition. The best result is obtained with a multiclassifier solution with a recognition rate of 95.5 percent, without channel selection before the classification. The proposed approach also provides a new way of analyzing brain activities due to the receptive field of the CNN models.

Paperback version of the second edition of 2001. This book covers the so-called Bayesian approach to statistical inference and, in particular, its decision-theoretic aspects. The foundations of this axiomatic framework (choice of the prior, optimal decisions, tests and confidence regions) are treated in detail, as are more recent developments in Bayesian analysis such as model choice, the use of stochastic numerical approximation methods (MCMC), the theory of noninformative priors (Berger-Bernardo axioms) and the relation to the classical theory of admissibility. Each chapter is complemented by an extensive series of exercises of increasing difficulty and by bibliographic notes on the topics covered. The book can be used in a Master's program in Applied Mathematics, Biometrics, Econometrics or any other program that draws on quantitative techniques for information processing. Its only prerequisite is a basic course in probability theory and mathematical statistics. It can also be used by doctoral students or established researchers looking for an effective statistical methodology for analyzing their model(s). Winner of the 2004 DeGroot Prize. This paperback edition, a reprint of the 2001 edition, is a graduate-level textbook that introduces Bayesian statistics and decision theory. It covers both the basic ideas of statistical theory, and also some of the more modern and advanced topics of Bayesian statistics such as complete class theorems, the Stein effect, Bayesian model choice, hierarchical and empirical Bayes modeling, Monte Carlo integration including Gibbs sampling, and other MCMC techniques.
It was awarded the 2004 DeGroot Prize by the International Society for Bayesian Analysis (ISBA) for setting "a new standard for modern textbooks dealing with Bayesian methods, especially those using MCMC techniques, and that it is a worthy successor to DeGroot's and Berger's earlier texts".

We describe an asymmetric approach to fMRI and MEG/EEG fusion in which fMRI data are treated as empirical priors on electromagnetic sources, such that their influence depends on the MEG/EEG data, by virtue of maximizing the model evidence. This is important if the causes of the MEG/EEG signals differ from those of the fMRI signal. Furthermore, each suprathreshold fMRI cluster is treated as a separate prior, which is important if fMRI data reflect neural activity arising at different times within the EEG/MEG data. We present methodological considerations when mapping from a 3D fMRI Statistical Parametric Map to a 2D cortical surface and thence to the covariance components used within our Parametric Empirical Bayesian framework. Our previous introduction of a canonical (inverse-normalized) cortical mesh also allows deployment of fMRI priors that live in a template space; for example, from a group analysis of different individuals. We evaluate the ensuing scheme with MEG and EEG data recorded simultaneously from 12 participants, using the same face-processing paradigm under which independent fMRI data were obtained. Because the fMRI priors become part of the generative model, we use the model evidence to compare (i) multiple versus single, (ii) valid versus invalid, (iii) binary versus continuous, and (iv) variance versus covariance fMRI priors. For these data, multiple, valid, binary, and variance fMRI priors proved best for a standard Minimum Norm inversion. Interestingly, however, inversion using Multiple Sparse Priors benefited little from additional fMRI priors, suggesting that they already provide a sufficiently flexible generative model.

Dynamic Causal Modelling (DCM) is an approach first introduced for the analysis of functional magnetic resonance imaging (fMRI) to quantify effective connectivity between brain areas. Recently, this framework has been extended and established in the magneto- and electroencephalography (M/EEG) domain. DCM for M/EEG entails the inversion of a full spatiotemporal model of evoked responses, over multiple conditions. This model rests on a biophysical and neurobiological generative model for electrophysiological data. A generative model is a prescription of how data are generated. The inversion of a DCM provides conditional densities on the model parameters and, indeed, on the model itself. These densities enable one to answer key questions about the underlying system. A DCM comprises two parts: one part describes the dynamics within and among neuronal sources, and the second describes how source dynamics generate data in the sensors, using the lead-field. The parameters of this spatiotemporal model are estimated using a single (iterative) Bayesian procedure. In this paper, we will motivate and describe the current DCM framework. Two examples show how the approach can be applied to M/EEG experiments.

Objective:
Magnetoencephalography (MEG) dipole localization of epileptic spikes is useful in epilepsy surgery for mapping the extent of abnormal cortex and to focus intracranial electrodes. Visually analyzing large amounts of data produces fatigue and error. Most automated techniques are based on matching of interictal spike templates or predictive filtering of the data and do not explicitly include source localization as part of the analysis. This leads to poor sensitivity versus specificity characteristics. We describe a fully automated method that combines time-series analysis with source localization to detect clusters of focal neuronal current generators within the brain that produce interictal spike activity.
Methods:
We first use an ICA (independent components analysis) method to decompose the multichannel MEG data and identify those components that exhibit spike-like characteristics. From these detected spikes we then find those whose spatial topographies across the array are consistent with focal neural sources, and determine the foci of equivalent current dipoles and their associated time courses. We then perform a clustering of the localized dipoles based on distance metrics that take into consideration both their locations and time courses. The final step of refinement consists of retaining only those clusters that are statistically significant. The average locations and time series from significant clusters comprise the final output of our method.
Results and significance:
Data were processed from 4 patients with partial focal epilepsy. In all three subjects for whom surgical resection was performed, clusters were found in the vicinity of the resected area.
Conclusions:
The presented procedure is promising and likely to be useful to the physician as a more sensitive, automated and objective method to help in the localization of the interictal spike zone of intractable partial seizures. The final output can be visually verified by neurologists in terms of both the location and distribution of the dipole clusters and their associated time series. Due to the clinical relevance and demonstrated promise of this method, further investigation of this approach is warranted.

We consider the problem of learning the structure of a pairwise graphical model over continuous and discrete variables. We present a new pairwise model for graphical models with both continuous and discrete variables that is amenable to structure learning. In previous work, authors have considered structure learning of Gaussian graphical models and structure learning of discrete models. Our approach is a natural generalization of these two lines of work to the mixed case. The penalization scheme involves a novel symmetric use of the group-lasso norm and follows naturally from a particular parameterization of the model. Supplementary materials for this article are available online.

Milestones in sparse signal reconstruction and compressive sensing can be understood in a probabilistic Bayesian context, fusing underdetermined measurements with knowledge about low level signal properties in the posterior distribution, which is maximized for point estimation. We review recent progress to advance beyond this setting. If the posterior is used as distribution to be integrated over instead of merely an optimization criterion, sparse estimators with better properties may be obtained, and applications beyond point reconstruction from fixed data can be served. We describe novel variational relaxations of Bayesian integration, characterized as well as posterior maximization, which can be solved robustly for very large models by algorithms unifying convex reconstruction and Bayesian graphical model technology. They excel in difficult real-world imaging problems where posterior maximization performance is often unsatisfactory.

Bayesian nonparametrics works - theoretically, computationally. The theory provides highly flexible models whose complexity grows appropriately with the amount of data. Computational issues, though challenging, are no longer intractable. All that is needed is an entry point: this intelligent book is the perfect guide to what can seem a forbidding landscape. Tutorial chapters by Ghosal, Lijoi and Prünster, Teh and Jordan, and Dunson advance from theory, to basic models and hierarchical modeling, to applications and implementation, particularly in computer science and biostatistics. These are complemented by companion chapters by the editors and Griffin and Quintana, providing additional models, examining computational issues, identifying future growth areas, and giving links to related topics. This coherent text gives ready access both to underlying principles and to state-of-the-art practice. Specific examples are drawn from information retrieval, NLP, machine vision, computational biology, biostatistics, and bioinformatics.

This monograph provides an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks. The application areas are chosen with the following three criteria in mind: (1) expertise or knowledge of the authors; (2) the application areas that have already been transformed by the successful use of deep learning technology, such as speech recognition and computer vision; and (3) the application areas that have the potential to be impacted significantly by deep learning and that have been experiencing research growth, including natural language and text processing, information retrieval, and multimodal information processing empowered by multi-task deep learning.

In this paper, we present an infinite hierarchical non-parametric Bayesian
model to extract the hidden factors over observed data, where the number of
hidden factors for each layer is unknown and can be potentially infinite.
Moreover, the number of layers can also be infinite. We construct the model
structure that allows continuous values for the hidden factors and weights,
which makes the model suitable for various applications. We use the
Metropolis-Hastings method to infer the model structure. Then the performance
of the algorithm is evaluated by the experiments. Simulation results show that
the model fits the underlying structure of simulated data.

Multi-subject electroencephalography (EEG) classification involves algorithm development for automatically categorizing brain waves measured from multiple subjects who undergo the same mental task. Common spatial patterns (CSP) or its probabilistic counterpart, PCSP, is a popular discriminative feature extraction method for EEG classification. Models in CSP or PCSP are trained on a subject-by-subject basis so that inter-subject information is neglected. In the case of multi-subject EEG classification, however, it is desirable to capture inter-subject relatedness in learning a model. In this paper we present a nonparametric Bayesian model for a multi-subject extension of PCSP where subject relatedness is captured by assuming that spatial patterns across subjects share a latent subspace. Spatial patterns and the shared latent subspace are jointly learned by variational inference. We use an infinite latent feature model to automatically infer the dimension of the shared latent subspace, placing Indian Buffet process (IBP) priors on our model. Numerical experiments on the BCI competition III IVa and IV 2a datasets demonstrate the high performance of our method, compared to PCSP and existing Bayesian multi-task CSP models.

Multi-subject electroencephalography (EEG) classification involves the categorization of brain waves measured from multiple subjects, each of whom undergoes the same mental task. Common spatial patterns (CSP) or probabilistic CSP (PCSP) are widely used for extracting discriminative features from EEG, although they are trained on a subject-by-subject basis and inter-subject information is neglected. Moreover, the performance is degraded when only a few training samples are available for each subject. In this paper, we present a method for Bayesian CSP with Dirichlet process (DP) priors, where spatial patterns (corresponding to basis vectors) are simultaneously learned and clustered across subjects using variational Bayesian inference, which facilitates a flexible mixture model where the number of components is also learned. Spatial patterns in the same cluster share the hyperparameters of their prior distributions, which means information transfer is facilitated among subjects with similar spatial patterns. Numerical experiments using the BCI competition IV 2a dataset demonstrated the high performance of our method, compared with existing PCSP and Bayesian CSP methods with a single prior distribution.

In many cases, observed brain signals can be assumed to be linear mixtures of unknown brain sources/components. It is the task of blind source separation (BSS) to find the sources. However, the number of brain sources is generally larger than the number of mixtures, which leads to an underdetermined model with infinitely many solutions. Under the reasonable assumption that brain sources are sparse within a domain, e.g., in the spatial, time, or time-frequency domain, we may obtain the sources through sparse representation. As explained in this article, several other typical problems, e.g., feature selection in brain signal processing, can also be formulated as the underdetermined linear model and solved by sparse representation. This article first reviews the probabilistic results of the equivalence between two important sparse solutions: the 0-norm and 1-norm solutions. In sparse representation-based brain component analysis including blind separation of brain sources and electroencephalogram (EEG) inverse imaging, the equivalence is related to the recoverability of the sources. This article also focuses on the applications of sparse representation in brain signal processing, including components extraction, BSS and EEG inverse imaging, feature selection, and classification. Based on functional magnetic resonance imaging (fMRI) and EEG data, the corresponding methods and experimental results are reviewed.

We consider a class of adaptive MCMC algorithms using a Langevin-type proposal density. We state and prove regularity conditions for the convergence of these algorithms. In addition to these theoretical results we introduce a number of methodological innovations that can be applied much more generally. We assess the performance of these algorithms with simulation studies, including an example of the statistical analysis of a point process driven by a latent log-Gaussian Cox process.

Magnetoencephalography (MEG) is an important non-invasive method for studying activity within the human brain. Source localization methods can be used to estimate spatiotemporal activity from MEG measurements with high temporal resolution, but the spatial resolution of these estimates is poor due to the ill-posed nature of the MEG inverse problem. Recent developments in source localization methodology have emphasized temporal as well as spatial constraints to improve source localization accuracy, but these methods can be computationally intense. Solutions emphasizing spatial sparsity hold tremendous promise, since the underlying neurophysiological processes generating MEG signals are often sparse in nature, whether in the form of focal sources, or distributed sources representing large-scale functional networks. Recent developments in the theory of compressed sensing (CS) provide a rigorous framework to estimate signals with sparse structure. In particular, a class of CS algorithms referred to as greedy pursuit algorithms can provide both high recovery accuracy and low computational complexity. Greedy pursuit algorithms are difficult to apply directly to the MEG inverse problem because of the high-dimensional structure of the MEG source space and the high spatial correlation in MEG measurements. In this paper, we develop a novel greedy pursuit algorithm for sparse MEG source localization that overcomes these fundamental problems. This algorithm, which we refer to as the Subspace Pursuit-based Iterative Greedy Hierarchical (SPIGH) inverse solution, exhibits very low computational complexity while achieving very high localization accuracy. We evaluate the performance of the proposed algorithm using comprehensive simulations, as well as the analysis of human MEG data during spontaneous brain activity and somatosensory stimuli. 
These studies reveal substantial performance gains provided by the SPIGH algorithm in terms of computational complexity, localization accuracy, and robustness.
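To illustrate what a greedy pursuit algorithm looks like, the sketch below implements Orthogonal Matching Pursuit (OMP), a simpler member of the same family. It is not the SPIGH algorithm or Subspace Pursuit itself; the function name `omp` and the assumption of unit-norm dictionary columns are illustrative choices.

```python
import numpy as np

def omp(a, y, n_nonzero):
    """Orthogonal Matching Pursuit sketch for y ~ a @ x with at most n_nonzero active entries.

    a: (n_measurements, n_atoms) dictionary, columns assumed to have unit norm.
    """
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Greedy step: pick the atom most correlated with the current residual.
        corr = np.abs(a.T @ residual)
        if support:
            corr[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected atoms jointly by least squares, then update the residual.
        coef, *_ = np.linalg.lstsq(a[:, support], y, rcond=None)
        residual = y - a[:, support] @ coef
    x = np.zeros(a.shape[1])
    x[support] = coef
    return x, sorted(support)
```

The per-iteration cost is dominated by one matrix-vector product and one small least-squares fit, which is why this family has low computational complexity. The greedy selection step is also where highly correlated columns, such as neighboring sources in an MEG lead field, cause the difficulties the abstract describes.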

We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the equivalence between Gibbs distributions and Markov random fields (MRFs), this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states ("annealing"), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel "relaxation" algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios.
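The annealing idea can be sketched on a toy problem: restoring a binary image under an Ising-type prior by Gibbs sampling while the temperature is lowered. This is only an illustration under stated assumptions. The energy function, the parameters `beta` and `eta`, the geometric cooling schedule, and the function name `anneal_map` are all choices made here, not details taken from the abstract.

```python
import numpy as np

def anneal_map(noisy, beta=1.0, eta=1.5, t_start=4.0, t_end=0.1, n_sweeps=30, seed=0):
    """Approximate MAP restoration of a {-1, +1} image by annealed Gibbs sampling.

    Assumed energy: E(x) = -beta * sum_<i,j> x_i x_j - eta * sum_i x_i y_i,
    where y is the noisy image and <i,j> ranges over 4-neighbour pairs.
    """
    rng = np.random.default_rng(seed)
    x = noisy.copy()
    rows, cols = x.shape
    # Geometric cooling schedule (a simplification chosen for this sketch).
    for temp in np.geomspace(t_start, t_end, n_sweeps):
        for i in range(rows):
            for j in range(cols):
                nb = 0.0  # sum over the 4-neighbourhood, respecting image borders
                if i > 0:
                    nb += x[i - 1, j]
                if i < rows - 1:
                    nb += x[i + 1, j]
                if j > 0:
                    nb += x[i, j - 1]
                if j < cols - 1:
                    nb += x[i, j + 1]
                field = beta * nb + eta * noisy[i, j]
                # Gibbs conditional P(x_ij = +1 | rest) at the current temperature.
                p_plus = 1.0 / (1.0 + np.exp(-2.0 * field / temp))
                x[i, j] = 1 if rng.random() < p_plus else -1
    return x
```

At high temperature the sampler explores freely. As the temperature falls, each update increasingly follows the sign of the local field, so the state settles into a low-energy configuration.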

Classifying electroencephalography (EEG) signals is an important step in developing EEG-based brain-computer interfaces (BCIs). Currently, kernel-based methods such as the support vector machine (SVM) are considered the state-of-the-art methods for this problem. In this paper, we apply Gaussian process (GP) classification to binary discrimination of motor imagery EEG data. Compared with the SVM, GP-based methods naturally provide probability outputs for identifying a trusted prediction, which can be used for post-processing in a BCI. Experimental results show that the classification methods based on a GP perform similarly to kernel logistic regression and probabilistic SVM in terms of predictive likelihood, but outperform SVM and K-nearest neighbor (KNN) in terms of 0–1 loss class prediction error.

The data of interest are assumed to be represented as N-dimensional real vectors, and these vectors are compressible in some linear basis B, implying that the signal can be reconstructed accurately using only a small number M ≪ N of basis-function coefficients associated with B. Compressive sensing is a framework whereby one does not measure one of the aforementioned N-dimensional signals directly, but rather a set of related measurements, with the new measurements a linear combination of the original underlying N-dimensional signal. The number of required compressive-sensing measurements is typically much smaller than N, offering the potential to simplify the sensing system. Let f denote the unknown underlying N-dimensional signal, and g a vector of compressive-sensing measurements, then one may approximate f accurately by utilizing knowledge of the (under-determined) linear relationship between f and g, in addition to knowledge of the fact that f is compressible in B. In this paper we employ a Bayesian formalism for estimating the underlying signal f based on compressive-sensing measurements g. The proposed framework has the following properties: i) in addition to estimating the underlying signal f, "error bars" are also estimated, these giving a measure of confidence in the inverted signal; ii) using knowledge of the error bars, a principled means is provided for determining when a sufficient number of compressive-sensing measurements have been performed; iii) this setting lends itself naturally to a framework whereby the compressive sensing measurements are optimized adaptively and hence not determined randomly; and iv) the framework accounts for additive noise in the compressive-sensing measurements and provides an estimate of the noise variance. In this paper we present the underlying theory, an associated algorithm, example results, and provide comparisons to other compressive-sensing inversion algorithms in the literature.
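The "error bars" in property (i) can be illustrated with a minimal sketch of the Gaussian posterior over the basis coefficients. It assumes, as a simplification, that the prior precisions `alpha` and the noise variance are held fixed, whereas the framework described above estimates such quantities from the data. The function name `bcs_posterior` and the convention that `phi` already combines the measurement matrix with the basis B are also assumptions.

```python
import numpy as np

def bcs_posterior(phi, g, alpha, noise_var):
    """Posterior mean and error bars for g = phi @ w + noise, with w_i ~ N(0, 1 / alpha_i).

    phi:       (n_measurements, n_coefficients) effective sensing matrix (assumed known).
    alpha:     (n_coefficients,) prior precisions, held fixed in this sketch.
    noise_var: additive Gaussian noise variance, held fixed in this sketch.
    """
    precision = phi.T @ phi / noise_var + np.diag(alpha)
    cov = np.linalg.inv(precision)  # posterior covariance of the coefficients
    mean = cov @ phi.T @ g / noise_var  # posterior mean
    error_bars = np.sqrt(np.diag(cov))  # one posterior standard deviation per coefficient
    return mean, error_bars
```

Because each added measurement can only increase the posterior precision, the error bars never grow as measurements accumulate. That monotone behavior is what makes them usable as a stopping criterion, in the spirit of property (ii).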

Electroencephalograms (EEGs) are becoming increasingly important measurements of brain activi