Abstract

We propose an online learning algorithm for training a logistic regression model on nonstationary classification problems. The nonstationarity is captured by modelling the weights in a logistic regression classifier as evolving according to a first-order Markov process. The weights are updated using the extended Kalman filter formalism, and nonstationarities are tracked by inferring a time-varying state noise variance parameter. We describe an algorithm for doing this based on maximising the evidence of updated predictions. The algorithm is illustrated on a number of synthetic problems.

I. Introduction

This paper proposes an online learning algorithm for training a logistic regression model on nonstationary classification problems. By nonstationary we mean that the statistics of each class may vary with time or, equivalently from the classification perspective, that the optimal decision boundary changes with time. The simplest online algorithm for training a logistic regression m...
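The model described in the abstract can be illustrated with a minimal sketch. It assumes a random-walk evolution of the weights, a Bernoulli observation variance y(1 - y) in the linearised update, and a state noise variance `q` that is held fixed; the paper instead infers a time-varying `q` by maximising the evidence, and that step is not reproduced here. The function names, the two-dimensional toy stream and the default values are all hypothetical choices made for illustration.

```python
import math
import random

def sigmoid(a):
    # Numerically safe logistic function.
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def ekf_logistic_step(w, P, x, z, q):
    """One extended-Kalman-filter style update of logistic regression weights.

    w: weight mean (list), P: weight covariance (list of lists),
    x: input vector (list), z: label in {0, 1},
    q: state noise variance (fixed here; an assumption of this sketch).
    """
    d = len(w)
    # Predict: a random-walk prior adds q to the diagonal of the covariance.
    P = [[P[i][j] + (q if i == j else 0.0) for j in range(d)] for i in range(d)]
    # Linearise the logistic output around the prior mean.
    y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    s = y * (1.0 - y)  # Bernoulli variance at the predicted output
    Px = [sum(P[i][j] * x[j] for j in range(d)) for i in range(d)]
    var_a = sum(xi * pxi for xi, pxi in zip(x, Px))  # variance of the activation
    denom = 1.0 + s * var_a
    # Update: the Kalman gain is folded into these two expressions.
    w_new = [w[i] + Px[i] * (z - y) / denom for i in range(d)]
    P_new = [[P[i][j] - s * Px[i] * Px[j] / denom for j in range(d)]
             for i in range(d)]
    return w_new, P_new

def run_drift_demo(q=0.01, n=300, seed=0):
    """Track a decision boundary that flips sign after n samples."""
    rng = random.Random(seed)
    w, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
    snapshots = []
    for phase_sign in (1.0, -1.0):  # the boundary flips between phases
        for _ in range(n):
            x1 = rng.uniform(-1.0, 1.0)
            z = 1 if phase_sign * x1 > 0 else 0
            w, P = ekf_logistic_step(w, P, [1.0, x1], z, q)
        snapshots.append(list(w))
    return snapshots
```

Calling `run_drift_demo()` returns the weight vector at the end of each phase. With a positive `q` the covariance never collapses, so the slope weight can change sign after the boundary flips; setting `q` to zero recovers a stationary model whose covariance only shrinks.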
[Figures omitted: panels plotting q against t, <α> against t, and E against t; the E panels compare curves labelled Nonstationary, Stationary and St. Descent.]
... In the following, the models exploited to define the two criteria used to decide when to query the labels are introduced. The two criteria are a label uncertainty criterion defined through logistic regression [25] and a density-based criterion defined through a Growing Gaussian Mixture Model (GGMM) [1]. ...
... In order to calculate the expression $p(\theta_t \mid \theta_{t-1})$, we must specify how the parameters change over time. Following [25], we assume no knowledge of the drifting distribution $p(\theta_t \mid \theta_{t-1})$. Thus, Eq. (15) can be eliminated by estimating $p(y_t \mid x_t, D_{t-1})$, which is done as follows: ...
... 2) Handling of concept drift: In the nonstationary setting, a variant of (21) proposed in [25] is used. The situation is exactly the same as for the stationary case, except that the prior distribution is now $\mathcal{N}(\mu_{t-1}, \Sigma_{t-1} + v_t f)$. ...
Article
Active learning (AL) is a promising way to efficiently build up training sets with minimal supervision. A learner deliberately queries specific instances to tune the classifier's model using as few labels as possible. The challenge for streaming is that the data distribution may evolve over time, and therefore the model must adapt. Another challenge is the sampling bias where the sampled training set does not reflect the underlying data distribution. In the presence of concept drift, sampling bias is more likely to occur as the training set needs to represent the whole evolving data. To tackle these challenges, we propose a novel bi-criteria AL (BAL) approach that relies on two selection criteria, namely, label uncertainty criterion and density-based criterion. While the first criterion selects instances that are the most uncertain in terms of class membership, the latter dynamically curbs the sampling bias by weighting the samples to reflect on the true underlying distribution. To design and implement these two criteria for learning from streams, BAL adopts a Bayesian online learning approach and combines online classification and online clustering through the use of online logistic regression and online growing Gaussian mixture models, respectively. Empirical results obtained on standard synthetic and real-world benchmarks show the high performance of the proposed BAL method compared with the state-of-the-art AL methods.
... On the one hand, this approach is highly interpretable and supports relatively straightforward analyses of the impacts of different behavioural variables on state transition times; on the other hand, this may come at a cost to predictive accuracy around state boundaries. We furthermore expect, given the challenge of overfitting with these datasets that motivated our left-right state constraint, that any comparable dynamic models with a continuous state space (e.g., Penny & Roberts, 1999) would be very difficult to suitably regularise. ...
Article
An emerging goal in neuroscience is tracking what information is represented in brain activity over time as a participant completes some task. While electroencephalography (EEG) and magnetoencephalography (MEG) offer millisecond temporal resolution of how activity patterns emerge and evolve, standard decoding methods present significant barriers to interpretability as they obscure the underlying spatial and temporal activity patterns. We instead propose the use of a generative encoding model framework that simultaneously infers the multivariate spatial patterns of activity and the variable timing at which these patterns emerge on individual trials. An encoding model inversion maps from these parameters to the equivalent decoding model, allowing predictions to be made about unseen test data in the same way as in standard decoding methodology. These SpatioTemporally Resolved MVPA (STRM) models can be flexibly applied to a wide variety of experimental paradigms, including classification and regression tasks. We show that these models provide insightful maps of the activity driving predictive accuracy metrics; demonstrate behaviourally meaningful variation in the timing of pattern emergence on individual trials; and achieve predictive accuracies that are either equivalent or surpass those achieved by more widely used methods. This provides a new avenue for investigating the brain's representational dynamics and could ultimately support more flexible experimental designs in the future.
... On the one hand, this approach is highly interpretable and supports relatively straightforward analyses of the impacts of different behavioural variables on state transition times; on the other hand, this may come at a cost to predictive accuracy around state boundaries. We furthermore expect, given the challenge of overfitting with these datasets that motivated our sequential state constraint, that any comparable dynamic models with a continuous state space (e.g., Penny & Roberts, 1999) would be very difficult to suitably regularise. ...
Preprint
An emerging goal in neuroscience is tracking what information is represented in brain activity over time as a participant completes some task. Whilst EEG and MEG offer millisecond temporal resolution of how activity patterns emerge and evolve, standard decoding methods present significant barriers to interpretability as they obscure the underlying spatial and temporal activity patterns. We instead propose the use of a generative encoding model framework that simultaneously infers the multivariate spatial patterns of activity and the variable timing at which these patterns emerge on individual trials. An encoding model inversion allows predictions to be made about unseen test data in the same way as in standard decoding methodology. These SpatioTemporally Resolved MVPA (STRM) models can be flexibly applied to a wide variety of experimental paradigms, including classification and regression tasks. We show that these models provide insightful maps of the activity driving predictive accuracy metrics; demonstrate behaviourally meaningful variation in the timing of pattern emergence on individual trials; and achieve predictive accuracies that are either equivalent or surpass those achieved by more widely used methods. This provides a new avenue for investigating the brain’s representational dynamics and could ultimately support more flexible experimental designs in future.
HIGHLIGHTS
We introduce SpatioTemporally Resolved MVPA (STRM), an approach that explicitly models how successive stages of stimulus processing are distributed in both space and time in M/EEG data.
We show that STRM is broadly applicable to diverse types of M/EEG data and outputs meaningful and interpretable maps of how neural representations evolve in space and time at millisecond resolution.
The trial-specific deviations in activity pattern timings identified by STRM are not random, but vary systematically with inter-trial differences in behavioural, cognitive and physiological variables.
These methods result in predictive accuracy metrics that are mostly equivalent to, or a modest improvement on, conventional methods.
Article
We develop a fully Bayesian tracking algorithm with the purpose of providing classification prediction results that are unbiased when applied uniformly to individuals with differing sensitive variable values, e.g., of different races, sexes, etc. Here, we consider bias in the form of group-level differences in false prediction rates between the different sensitive variable groups. Given that the method is fully Bayesian, it is well suited for situations where group parameters or regression coefficients are dynamic quantities. We illustrate our method, in comparison to others, on simulated datasets and two real-world datasets.
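The bias measure named in this abstract, group-level differences in false prediction rates, can be made concrete with a short sketch. It assumes binary labels and predictions coded as 0 and 1 and a single categorical sensitive variable; the function names are hypothetical, and the sketch only measures the gap rather than reproducing the Bayesian tracking method itself.

```python
from collections import defaultdict

def group_error_rates(y_true, y_pred, groups):
    """Return {group: (false_positive_rate, false_negative_rate)}."""
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        if t == 1:
            c["pos"] += 1
            c["fn"] += int(p == 0)  # positive case predicted negative
        else:
            c["neg"] += 1
            c["fp"] += int(p == 1)  # negative case predicted positive
    rates = {}
    for g, c in counts.items():
        fpr = c["fp"] / c["neg"] if c["neg"] else float("nan")
        fnr = c["fn"] / c["pos"] if c["pos"] else float("nan")
        rates[g] = (fpr, fnr)
    return rates

def rate_gap(rates):
    """Largest between-group difference in FPR and in FNR."""
    fprs = [r[0] for r in rates.values()]
    fnrs = [r[1] for r in rates.values()]
    return max(fprs) - min(fprs), max(fnrs) - min(fnrs)
```

A gap of zero in both components would mean the classifier makes false positive and false negative errors at the same rate in every group.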
Chapter
Brain-computer interfaces (BCIs) are computerized systems that convert brain activity into control commands to operate software or external devices. Though promising, BCIs currently have limited practicality and usership due to poor signal classification and large training data requirements. The present study aims to overcome both challenges by combining three brain signals. This paradigm could improve existing BCI technical efficacy, and extrapolate to applications where hands-free visual interfaces could equip users with communication and information resources that improve work processes.
Chapter
The proposed method (FraudFox) provides solutions to adversarial attacks in a resource-constrained environment. We focus on questions like the following: How suspicious is ‘Smith’, trying to buy $500 shoes at 3am on a Monday? How should we merge the risk scores from a handful of risk-assessment modules (‘oracles’) in an adversarial environment? More importantly, given historical data (orders, prices, and what happened afterwards) and business goals or restrictions, which transactions, like the ‘Smith’ transaction above, should we ‘pass’, and which should we send to human investigators? The business restrictions could be ‘at most x investigations are feasible’ or ‘at most $y lost due to fraud’. These are the two research problems we focus on in this work. One approach to the first problem (‘oracle-weighting’) is to use Extended Kalman Filters with dynamic importance weights to automatically and continuously update our weights for each ‘oracle’. For the second problem, we show how to derive an optimal decision surface and how to compute the Pareto optimal set, to allow what-if questions. An important consideration is adaptation: fraudsters will change their behavior according to our past decisions; thus, we need to adapt accordingly. The resulting system, FraudFox, is scalable, adaptable to changing fraudster behavior, effective, and already in production at Amazon. FraudFox augments a fraud prevention sub-system and has led to significant performance gains.
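The Pareto optimal set mentioned in this abstract can be illustrated generically. The sketch below assumes each candidate decision policy is summarised by two costs to be minimised, for example the number of investigations and the fraud loss; the function name and the tuple layout are hypothetical, and this is not FraudFox's own procedure.

```python
def pareto_front(points):
    """Return the non-dominated (cost_a, cost_b) pairs, both costs minimised.

    A point is dropped when another point is no worse on both costs
    and differs from it on at least one.
    """
    front = []
    best_b = float("inf")
    # After sorting by cost_a (ties broken by cost_b), a point is on the
    # front only if it strictly improves on the best cost_b seen so far.
    for a, b in sorted(points):
        if b < best_b:
            front.append((a, b))
            best_b = b
    return front
```

For instance, `pareto_front([(1, 5), (2, 3), (3, 4), (4, 1), (2, 6)])` keeps `(1, 5)`, `(2, 3)` and `(4, 1)`; the other two points are each beaten on both costs by `(2, 3)`.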
Article
Bangladesh is a culturally conservative nation with limited freedom for women. A number of studies have evaluated intimate partner violence (IPV) and spousal physical violence in Bangladesh; however, the views of women have rarely been discussed in a quantitative manner. Three nationwide surveys in Bangladesh (2007, 2011, and 2014) were analyzed in this study to characterize the most vulnerable households, where women themselves accepted spousal physical violence as a general norm. In the 2007, 2011 and 2014 surveys, 31.3%, 31.9% and 28.7% of women, respectively, found justification for physical violence in the household. The binary logistic model showed wealth index, education of both women and their partners, religion, geographical division, decision-making freedom and marital age to be significant household contributors to women’s perspective in all three years. Women in rich households and highly educated women were found to be 40% and 50% less likely, respectively, to accept domestic physical violence than the poorest and illiterate women. Similarly, women who married before the age of 18 were 20% more likely to accept physical violence in the family as a norm. Apart from these particular groups (richest, highly educated and married after 18 years), other groups had an acceptance rate of household violence of around 30%. For any successful attempt to reduce spousal physical violence in the traditional patriarchal society of Bangladesh, interventions must target the most vulnerable households and the geographical areas where women experience spousal violence. Although this paper focuses on women’s attitudes, it is important that any intervention scheme be devised to target both men and women.
Article
Failure to use personal protective equipment (PPE) is associated with disease transmission. A survey using a self-administered questionnaire was carried out to determine the prevalence of PPE use and the factors associated with it, to verify whether use followed current recommendations, and to identify the main reasons given for non-use among dentists in Montes Claros (MG). Factors associated with simultaneous use of PPE were determined by multiple logistic regression. Of the 299 questionnaires, 296 were answered. The prevalence of use was 99% for gloves, 98% for masks, 88% for gowns, 68% for caps and 86% for protective eyewear. The main reason given for not using PPE was judging it unnecessary. Only 22.8% reported using all of this equipment together 100% of the time, and no respondent met all the recommendations. In the multivariate analysis, use was higher among women (OR=4.3; 95% CI: 2.03-8.93), among surgeons and periodontists (OR=5.3; 95% CI: 1.64-16.95) and among those who reported high job satisfaction (OR=2.8; 95% CI: 1.12-7.01), and lower among those over 40 years of age (OR=0.18; 95% CI: 0.07-0.47). Use appears to be growing among young and female dentists. However, simultaneous and proper use of PPE, which is essential for controlling cross-infection, remains low.
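The odds ratios and 95% intervals reported in this abstract can be related to logistic regression coefficients with a short sketch. It assumes the intervals are Wald intervals, symmetric on the log-odds scale with a multiplier of 1.96; the abstract does not state how they were computed, and both function names are hypothetical.

```python
import math

def wald_interval(beta, se, z=1.96):
    """Odds ratio and approximate 95% Wald interval from a logit coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def implied_from_interval(lower, upper, z=1.96):
    """Back out (odds_ratio, beta, se) from an interval on the odds-ratio scale.

    Assumes a Wald interval, so the odds ratio is the geometric mean
    of the two bounds.
    """
    beta = 0.5 * (math.log(lower) + math.log(upper))
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return math.exp(beta), beta, se
```

Under that assumption the reported numbers are internally consistent: the geometric mean of each pair of bounds rounds to the reported odds ratio, for example about 4.26 for the interval 2.03 to 8.93, against a reported value of 4.3.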
Thesis
Among individuals' basic needs, housing plays a highly relevant role, since it represents security and privacy from the individual's point of view, and accessibility, given the location of the dwelling relative to the set of processes and phenomena distributed unevenly across the territory. The focus of this research lies in the interaction between the movement of population groups, through migration and residential mobility, and the territorial outcomes of the spatial distribution of these groups in the municipalities that make up the Campinas Metropolitan Region. Having remained an important demographic hub for interstate and intrastate migration flows, the region has continued to receive a significant volume of migrants with widely varying characteristics, even though the economic and social changes of recent decades have altered the volume and direction of migration flows at the national level. As a result, the region receives both long-distance and short-distance migrants, in addition to internal movements that have stood out in the production and structuring of regional urban space through the expansion of peripheral areas with distinct building and infrastructure characteristics. Mobility, at the start of the twenty-first century, has culminated in a deepening of territorial inequalities, since, to a large extent, territorial expansion follows the characteristics of the social groups predominant in the areas where housing developments are built, through the valuation of location, alongside the growing relevance of intrametropolitan residential mobility for the processes of production and structuring of regional urban space.
The incentives and constraints, which are the motivating factors that set the population in motion, are increasingly mobilising social groups that are better positioned in terms of education and income. This is a new development compared with earlier periods, and it features prominently in the form and character of the urban expansion observed in the municipalities of the Campinas Metropolitan Region. Analysing the differences and similarities between migrants and non-migrants, as well as among migrants in their different modalities, is one of the methodological choices used to understand and advance the analysis of migration dynamics in the Campinas Metropolitan Region.
Article
Dynamic Bayesian models are developed for application in nonlinear, non-normal time series and regression problems, providing dynamic extensions of standard generalized linear models. A key feature of the analysis is the use of conjugate prior and posterior distributions for the exponential family parameters. This leads to the calculation of closed, standard-form predictive distributions for forecasting and model criticism. The structure of the models depends on the time evolution of underlying state variables, and the feedback of observational information to these variables is achieved using linear Bayesian prediction methods. Data analytic aspects of the models concerning scale parameters and outliers are discussed, and some applications are provided.
Article
A directed acyclic graph or influence diagram is frequently used as a representation for qualitative knowledge in some domains in which expert system techniques have been applied, and conditional probability tables on appropriate sets of variables form the quantitative part of the accumulated experience. It is shown how one can introduce imprecision into such probabilities as a data base of cases accumulates. By exploiting the graphical structure, the updating can be performed locally, either approximately or exactly, and the setup makes it possible to take advantage of a range of well-established statistical techniques. As examples we discuss discrete models, models based on Dirichlet distributions and models of the logistic regression type.
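For the Dirichlet case mentioned in this abstract, the local updating of one row of a conditional probability table can be sketched as follows. This is the textbook conjugate update, assuming complete cases and an independent Dirichlet prior for each parent configuration; the function names are hypothetical and the sketch does not cover the approximate or logistic-regression variants the abstract refers to.

```python
def dirichlet_update(alpha, counts):
    """Posterior Dirichlet parameters after observing category counts."""
    return [a + c for a, c in zip(alpha, counts)]

def dirichlet_mean(alpha):
    """Expected probability of each category."""
    total = sum(alpha)
    return [a / total for a in alpha]

def dirichlet_variance(alpha):
    """Variance of each category probability; it shrinks as cases accumulate."""
    total = sum(alpha)
    return [a * (total - a) / (total ** 2 * (total + 1)) for a in alpha]
```

Starting from a uniform prior `[1, 1, 1]` and observing counts `[2, 0, 1]` gives parameters `[3, 1, 2]`. The expected probability of the first category rises from 1/3 to 1/2, and the variances reported by `dirichlet_variance` quantify how imprecise each table entry still is.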
Article
The field of time series analysis is explored from its logical foundations to the most modern data analysis techniques. The presentation is developed, as far as possible, for continuous data, so that the inevitable use of discrete mathematics is postponed until the reader has gained some familiarity with the concepts. The monograph seeks to provide the reader with both the theoretical overview and the practical details necessary to correctly apply the full range of these powerful techniques. In addition, the last chapter introduces many specialized areas where research is currently in progress.