We propose an online learning algorithm for training a logistic regression model on nonstationary classification problems. The nonstationarity is captured by modelling the weights in a logistic regression classifier as evolving according to a first-order Markov process. The weights are updated using the extended Kalman filter formalism and nonstationarities are tracked by inferring a time-varying state noise variance parameter. We describe an algorithm for doing this based on maximising the evidence of updated predictions. The algorithm is illustrated on a number of synthetic problems.

I. Introduction

This paper proposes an online learning algorithm for training a logistic regression model on nonstationary classification problems. By nonstationary we mean that the statistics of each class may vary with time or, equivalently from the classification perspective, that the optimal decision boundary changes with time. The simplest online algorithm for training a logistic regression m...
All content in this area was uploaded by Stephen J. Roberts on Dec 23, 2013
[Figure: panels of q, <α>, and E plotted against t (t from 0 to 600), plus three further panels over the same range whose axis names are not recoverable; legend: Nonstationary, Stationary, St. Descent.]
[Figure: two panels with unlabeled axes spanning roughly −2 to 7.]
[Figure: panels of q, <α>, and E plotted against t (t from 0 to 300); legend: Nonstationary, Stationary, St. Descent.]
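The abstract describes logistic regression weights that follow a first-order Markov process and are updated with the extended Kalman filter. A minimal sketch of one such predict/update step follows, assuming random-walk dynamics with isotropic state noise `q * I` and a fixed, hand-picked `q`. The paper instead infers a time-varying state noise variance by maximising the evidence, which this sketch does not attempt. The function names, the synthetic label-flip stream, and all parameter values are hypothetical.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def ekf_logistic_step(w, P, x, z, q):
    """One predict/update step for logistic regression with random-walk weights.

    w, P : posterior mean and covariance of the weights from the previous step
    x, z : current input vector and its binary label (0.0 or 1.0)
    q    : state noise variance (assumed fixed here, not inferred)
    """
    # Predict: the random walk leaves the mean unchanged and inflates the covariance.
    P_pred = P + q * np.eye(len(w))

    # Linearise the sigmoid output around the predicted mean.
    y = sigmoid(w @ x)
    s2 = x @ P_pred @ x      # variance of the activation w^T x
    r = y * (1.0 - y)        # sigmoid slope, also the Bernoulli output variance

    # Kalman gain for the linearised observation model.
    gain = P_pred @ x / (1.0 + r * s2)

    # Update: move the mean along the gain and shrink the covariance.
    w_new = w + gain * (z - y)
    P_new = P_pred - r * np.outer(gain, P_pred @ x)
    return w_new, P_new, y


def run_drift_demo(seed=0, q=0.01, steps=600, flip_at=300):
    """Synthetic stream whose decision boundary flips sign at `flip_at`."""
    rng = np.random.default_rng(seed)
    w, P = np.zeros(2), np.eye(2)
    for t in range(steps):
        u = rng.normal()
        x = np.array([1.0, u])                 # bias term plus one feature
        sign = 1.0 if t < flip_at else -1.0    # label rule reverses halfway through
        z = 1.0 if sign * u > 0 else 0.0
        w, P, _ = ekf_logistic_step(w, P, x, z, q)
    return w, P


if __name__ == "__main__":
    w, P = run_drift_demo()
    print("final weights:", w)
```

With `q = 0` the covariance can only shrink, so the filter stops adapting once it has become confident. A positive `q` keeps some uncertainty in the weights, which is what lets the sketch follow the label flip.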
... In the following, the models used to define the two criteria for deciding when to query labels are introduced. The two criteria are label uncertainty, defined through logistic regression [25], and a density-based criterion, defined through the Growing Gaussian Mixture Model (GGMM) [1]. ...
... In order to calculate the expression $p(\theta_t \mid \theta_{t-1})$, we must specify how the parameters change over time. Following [25], we assume no knowledge of the drifting distribution $p(\theta_t \mid \theta_{t-1})$. Thus, Eq. (15) can be eliminated by estimating $p(y_t \mid x_t, D_{t-1})$, which is done as follows: ...
... 2) Handling of concept drift: In the nonstationary setting, a variant of (21) proposed in [25] is used. The situation is exactly the same as for the stationary case, except that the prior distribution is now $\mathcal{N}(\mu_{t-1}, \Sigma_{t-1} + v_t f)$. ...
Active learning (AL) is a promising way to efficiently build up training sets with minimal supervision. A learner deliberately queries specific instances to tune the classifier's model using as few labels as possible. The challenge for streaming is that the data distribution may evolve over time, and therefore the model must adapt. Another challenge is sampling bias, where the sampled training set does not reflect the underlying data distribution. In the presence of concept drift, sampling bias is more likely to occur as the training set needs to represent the whole evolving data. To tackle these challenges, we propose a novel bi-criteria AL (BAL) approach that relies on two selection criteria, namely, a label uncertainty criterion and a density-based criterion. While the first criterion selects instances that are the most uncertain in terms of class membership, the latter dynamically curbs the sampling bias by weighting the samples to reflect the true underlying distribution. To design and implement these two criteria for learning from streams, BAL adopts a Bayesian online learning approach and combines online classification and online clustering through the use of online logistic regression and online growing Gaussian mixture models, respectively. Empirical results obtained on standard synthetic and real-world benchmarks show the high performance of the proposed BAL method compared with the state-of-the-art AL methods.
... Bayesian approach to evolve a time-dependent set of weights under a linear dynamical system yields a suitably adaptive system in the presence of a non-stationary data set [1]. In this paper, we present an algorithm for adaptive nonlinear two-class distinction which extends the non-stationary logistic regression described in [5]. Logistic regression is a well-known method for classifying the value of a dependent variable based on a set of inputs. Various methods have been explored for extending logistic regression to problems in which the optimal decision boundary between two classes is time-dependent [5]. ...
... In this paper, we present an algorithm for adaptive nonlinear two-class distinction which extends the non-stationary logistic regression described in [5]. Logistic regression is a well-known method for classifying the value of a dependent variable based on a set of inputs. Various methods have been explored for extending logistic regression to problems in which the optimal decision boundary between two classes is time-dependent [5]. We present here a brief introduction to logistic regression and its non-stationary successors. ...
... Thus, the activation mean remains unaffected, but its variance has grown and leads to more moderated predictions. Once the class label $z_t$ for input $\phi_t$ has been observed, we adjust the covariance and the mean according to [5] Σ ...
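The snippet above notes that a larger activation variance leads to more moderated predictions while the activation mean is unchanged. A small sketch of that effect follows, assuming the commonly used probit-style approximation in which the mean activation is scaled by `1 / sqrt(1 + pi * variance / 8)` before the sigmoid is applied. The snippet does not state which formula the cited work uses, so both the formula and the function name are assumptions.

```python
import math


def moderated_probability(mean, variance):
    """Approximate class probability for a Gaussian-distributed activation.

    Assumes the scaling kappa = 1 / sqrt(1 + pi * variance / 8); this choice
    is an illustration, not taken from the quoted text.
    """
    kappa = 1.0 / math.sqrt(1.0 + math.pi * variance / 8.0)
    return 1.0 / (1.0 + math.exp(-kappa * mean))


if __name__ == "__main__":
    # Same activation mean, increasing variance: the output moves toward 0.5.
    for variance in (0.0, 1.0, 4.0, 16.0):
        print(variance, moderated_probability(2.0, variance))
```

With zero variance the function reduces to the plain sigmoid. As the variance grows, the prediction is pulled toward 0.5 without changing which class is favoured.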
Analysis of EEG for Brain Computer Interfacing (BCI) requires robust tools for discerning between brain states. In this paper, we explore an algorithm for extracting movement/non-movement data from EEG. We begin by projecting features from a dynamical system model of the EEG onto a nonlinear basis space. The basis function responses are then mapped via a logistic classifier onto a class-posterior decision space. The non-stationarity of the EEG signals is captured by parameterizing this mapping via a set of dynamically adaptive, time-dependent weights. We update these weights under a sequential Bayesian learning paradigm. Importantly, we aim such an adaptive classifier toward a system in which very few class labels are known, such as is the case for self-paced BCI experiments.
... MLR can be used to predict a response variable on the basis of continuous and/or categorical explanatory variables; to determine the percent of variance in the response variable explained by the explanatory variables; to rank the relative importance of independents; to assess interaction effects; and to understand the impact of covariate control variables. MLR allows the simultaneous comparison of more than one contrast, that is, the log odds of three or more contrasts are estimated simultaneously (Garson, 2009). If the response variable has more than two values, and there is no natural ordering of the categories, it is called Multinomial Logistic Regression. ...
Multinomial logistic regression (MLR) and discriminant analysis (DA) are two techniques that are commonly used for data classification. Both are applied to the 2012 Labor Force data for Palestine in order to predict the probability of a specific Labor Force (LF) category based on several explanatory variables. We used real LF data from the 2012 LF survey conducted by the Palestinian Central Bureau of Statistics (PCBS). The sample comprised 25353 observations from the West Bank and Gaza Strip. The target group was the age group 15-65 years for both sexes. The Labor Force data has 12 variables: the dependent variable is nominal with three categories, and there are 11 independent variables. So, we have two models for each technique. Correct classification is 83.5% for the MLR model compared with 81.1% for DA. In addition, the area under the ROC curve is 91.89% for MLR and 52.8% for DA. These results demonstrate that MLR can be a more powerful analytical technique.
Key Words: Confusion Matrix – ROC curve – Multinomial Logistic Regression – Discriminant Analysis – Odds ratio
... An altogether different approach to the problem of handling drift is to explicitly model the dynamics governing the evolution of the classes and decision boundary, via state-space modeling [38]. Penny and Roberts [39] assume that the evolution of the true parameters of a logistic regression model follows the simple Markov dynamics, giving rise to a state-space model whose structure is assumed known, and in turn determines the type of drift the estimator can handle. Similarly, Whittaker et al. [40] deploy a Kalman filter to monitor the parameters in a credit scoring classifier, providing a charting mechanism to identify when changes in the parameters occur. ...
Bangladesh is a culturally conservative nation with limited freedom for women. A number of studies have evaluated intimate partner violence (IPV) and spousal physical violence in Bangladesh; however, the views of women have rarely been discussed in a quantitative manner. Three nationwide surveys in Bangladesh (2007, 2011, and 2014) were analyzed in this study to characterize the most vulnerable households, where women themselves accepted spousal physical violence as a general norm. In the surveys, 31.3%, 31.9% and 28.7% of women found justification for physical violence in the household in 2007, 2011 and 2014, respectively. The binary logistic model showed wealth index, education of both women and their partners, religion, geographical division, decision-making freedom and marital age to be significant household contributors to women’s perspective in all three years. Women in rich households and the highly educated were found to be 40% and 50% less likely to accept domestic physical violence compared to the poorest and illiterate women. Similarly, women who got married before 18 years were 20% more likely to accept physical violence in the family as a norm. Apart from these particular groups (richest, highly educated and married after 18 years), other groups had around a 30% acceptance rate of household violence. For any successful attempt to reduce spousal physical violence in the traditional patriarchal society of Bangladesh, interventions must target the most vulnerable households and the geographical areas where women experience spousal violence. Although this paper focuses on women’s attitudes, it is important that any intervention scheme be devised to target both men and women.
Failure to use personal protective equipment (PPE) is associated with disease transmission. A survey using a self-administered questionnaire was conducted to determine the prevalence of PPE use and the factors associated with it, to verify whether use followed current recommendations, and to identify the main reasons given for non-use among dentists in Montes Claros (MG). Factors associated with the simultaneous use of PPE were determined by multiple logistic regression. Of the 299 questionnaires, 296 were answered. The prevalence of use was 99% for gloves, 98% for masks, 88% for gowns, 68% for caps and 86% for protective eyewear. The main reason given for not using PPE was considering it unnecessary. Only 22.8% reported using all of this equipment together 100% of the time, and no respondent met all the recommendations. In the multivariate analysis, use was higher among women (OR=4.3; 95% CI: 2.03-8.93), surgeons and periodontists (OR=5.3; 95% CI: 1.64-16.95), and those who reported high job satisfaction (OR=2.8; 95% CI: 1.12-7.01), and lower among those over 40 years of age (OR=0.18; 95% CI: 0.07-0.47). Use appears to be increasing among young and female dentists. However, simultaneous and adequate use of PPE, which is fundamental for controlling cross-infection, remains low.
Among individuals' basic needs, housing plays a highly relevant role, since it represents security and privacy from the individual's point of view, and accessibility, considering the location of the dwelling in relation to the set of processes and phenomena distributed unevenly across the territory. The focus of this research lies in the interaction between the movement of population groups, through migration and residential mobility, and the territorial outcomes of the spatial distribution of these groups in the municipalities that make up the Campinas Metropolitan Region (Região Metropolitana de Campinas, RM de Campinas). Having remained an important demographic pole for interstate and intrastate migratory flows, this region has continued to receive a substantial volume of migrants with widely varied characteristics, even though the economic and social changes of recent decades have altered the volume and direction of migratory flows at the national level. As a result, the region receives both long-distance and short-distance migrants, alongside internal movements that have become prominent in the production and structuring of regional urban space through the expansion of peripheral areas with distinct construction and infrastructure characteristics. At the start of the twenty-first century, mobility has culminated in a deepening of territorial inequalities, since territorial expansion largely follows the characteristics of the social groups predominant in the areas where housing products are built, through the valuation of location, in addition to the growing relevance of intrametropolitan residential mobility for the processes of production and structuring of regional urban space.
The incentives and constraints, which are the motivating factors that set the population in motion, are increasingly mobilizing social groups that are better positioned in terms of education and income. This is a new development relative to what was observed in earlier periods, and it features prominently in the form and character of the urban expansion observed in the municipalities of the RM de Campinas. The analysis of differences and similarities between migrants and non-migrants, as well as among migrants in their distinct modalities, is one of the methodological choices used to understand and advance the analysis of migratory dynamics in the RM de Campinas.
This paper describes a new method for online logistic regression when the feature vectors lie close to a low-dimensional manifold and when observations of the feature vectors may be noisy or have missing elements. The new method exploits the low-dimensional structure of the feature vector, finds a multi-scale union of linear subsets that approximates the manifold, and performs online logistic regression separately on each subset. The union of subsets enables better performance in the face of noisy and missing data, and offsets challenges associated with the curse of dimensionality. The effectiveness of the proposed method in predicting correct labels of the data and in adapting to slowly time-varying manifolds is demonstrated using numerical examples and real data.
Dynamic Bayesian models are developed for application in nonlinear, non-normal time series and regression problems, providing dynamic extensions of standard generalized linear models. A key feature of the analysis is the use of conjugate prior and posterior distributions for the exponential family parameters. This leads to the calculation of closed, standard-form predictive distributions for forecasting and model criticism. The structure of the models depends on the time evolution of underlying state variables, and the feedback of observational information to these variables is achieved using linear Bayesian prediction methods. Data analytic aspects of the models concerning scale parameters and outliers are discussed, and some applications are provided.
A directed acyclic graph or influence diagram is frequently used as a representation for qualitative knowledge in some domains in which expert system techniques have been applied, and conditional probability tables on appropriate sets of variables form the quantitative part of the accumulated experience. It is shown how one can introduce imprecision into such probabilities as a data base of cases accumulates. By exploiting the graphical structure, the updating can be performed locally, either approximately or exactly, and the setup makes it possible to take advantage of a range of well-established statistical techniques. As examples we discuss discrete models, models based on Dirichlet distributions and models of the logistic regression type.
The field of time series analysis is explored from its logical foundations to the most modern data analysis techniques. The presentation is developed, as far as possible, for continuous data, so that the inevitable use of discrete mathematics is postponed until the reader has gained some familiarity with the concepts. The monograph seeks to provide the reader with both the theoretical overview and the practical details necessary to correctly apply the full range of these powerful techniques. In addition, the last chapter introduces many specialized areas where research is currently in progress.