September 2014
·
3 Reads
March 2011
·
75 Reads
Biometrics
June 2009
·
35 Reads
·
10 Citations
Biometrics
We introduce a method of estimating disease prevalence from case-control family study data. Case-control family studies are performed to investigate the familial aggregation of disease; families are sampled via either a case or a control proband, and the resulting data contain information on disease status and covariates for the probands and their relatives. Here, we introduce estimators for overall prevalence and for covariate-stratum-specific (e.g., sex-specific) prevalence. These estimators combine the proportion of affected relatives of control probands with the proportion of affected relatives of case probands and are designed to yield approximately unbiased estimates of their population counterparts under certain commonly made assumptions. We also introduce corresponding confidence intervals designed to have good coverage properties even for small prevalences. Next, we describe simulation experiments where our estimators and intervals were applied to case-control family data sampled from fictional populations with various levels of familial aggregation. At all aggregation levels, the resulting estimates varied closely and symmetrically around their population counterparts, and the resulting intervals had good coverage properties, even for small sample sizes. Finally, we discuss the assumptions required for our estimators to be approximately unbiased, highlighting situations where an alternative estimator based only on relatives of control probands may perform better.
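The abstract does not state the estimator's formula, so the following is only a minimal sketch of one way the two proportions could be combined. It assumes that relatives have the same marginal prevalence as the population from which probands arise, so that prevalence p satisfies p = q_case * p + q_control * (1 - p), where q_case and q_control are the proportions of affected relatives of case and control probands. The function name and the counts are hypothetical, and this is not necessarily the authors' estimator.

```python
def combined_prevalence(aff_case_rel, n_case_rel, aff_ctrl_rel, n_ctrl_rel):
    """Solve p = q_case * p + q_control * (1 - p) for p.

    Illustrative assumption only: relatives share the population's
    marginal prevalence. Not taken from the paper itself.
    """
    q_case = aff_case_rel / n_case_rel    # affected share, relatives of cases
    q_ctrl = aff_ctrl_rel / n_ctrl_rel    # affected share, relatives of controls
    denom = 1.0 - q_case + q_ctrl
    if denom == 0.0:
        raise ValueError("prevalence is not identified when q_case=1 and q_ctrl=0")
    return q_ctrl / denom


# Hypothetical counts: 30 of 200 relatives of cases are affected,
# 10 of 200 relatives of controls are affected.
estimate = combined_prevalence(30, 200, 10, 200)
print(f"estimated prevalence: {estimate:.4f}")
```

When there is no familial aggregation (q_case equals q_control), the expression reduces to the proportion of affected relatives of control probands, which is the alternative estimator the abstract mentions.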
January 2008
·
1,094 Reads
·
3,727 Citations
Ripley brings together two crucial ideas in pattern recognition: statistical methods and machine learning via neural networks. He brings unifying principles to the fore, and reviews the state of the subject. Ripley also includes many examples to illustrate real problems in pattern recognition and how to overcome them.
June 2007
·
78 Reads
·
55 Citations
Likert attitude data consist of responses to favorable and unfavorable statements about an entity, where responses fall into ordered categories ranging from disagreement to agreement. Social science and marketing researchers frequently use data of this type to measure attitudes toward an entity such as a policy or product. We focus on data on American and British attitudes toward their respective nations (“national pride”). We introduce a multidimensional unfolding model (MUM) to describe the relationship between the data and the attitudes underlying them. Unlike most existing models, the MUM allows the data to reflect not just attitudes, but also response style, which is defined as a consistent and content-independent pattern of response category selection such as a tendency to agree with all statements. The MUM can be used to model multiple attitudes, which allows researchers to expand their analysis of the data of interest to include all available Likert data so as to increase information on response style. For example, we include additional data on immigration attitudes to help distinguish the effects of response style and national pride on our data. The MUM can be used to fit linear models for the effects of background variables on attitudes. Resulting inferences about attitudes are adjusted for response style and should be less biased. Simulation results strongly suggest that, unlike Likert’s popular scoring model, the MUM yields unbiased inferences even when there are unequal proportions of favorable and unfavorable statements.
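To make the final point concrete, here is a small sketch of Likert's summated scoring (not of the MUM itself). It assumes a 5-category scale coded 1 (disagree) to 5 (agree), with unfavorable statements reverse-coded before averaging. The item sets and the "always agree" respondent are hypothetical.

```python
def likert_score(responses, favorable, n_categories=5):
    """Mean item score after reverse-coding unfavorable statements."""
    scored = [
        r if is_favorable else (n_categories + 1 - r)
        for r, is_favorable in zip(responses, favorable)
    ]
    return sum(scored) / len(scored)


# A respondent whose response style is to agree with every statement,
# regardless of content.
always_agree = [5] * 6

balanced = [True, True, True, False, False, False]    # 3 favorable, 3 unfavorable
unbalanced = [True, True, True, True, False, False]   # 4 favorable, 2 unfavorable

balanced_score = likert_score(always_agree, balanced)
unbalanced_score = likert_score(always_agree, unbalanced)
print(balanced_score, unbalanced_score)
```

With balanced items the agreeing response style cancels out and the score sits at the scale midpoint. With more favorable than unfavorable statements the same respondent appears to hold a positive attitude, which is the kind of bias the abstract attributes to unequal proportions.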
June 2007
·
96 Reads
·
56 Citations
Journal of the American Statistical Association
February 2007
·
15 Reads
Journal of the American Statistical Association
May 2006
·
23 Reads
·
18 Citations
July 2005
·
52 Reads
·
5 Citations
One sense of “computer-intensive” statistics is just statistical methodology that makes use of a large amount of computer time. (Examples include the bootstrap, jackknife, smoothing, image analysis, and many uses of the EM algorithm.) However, the term is usually used for methods that go beyond the minimum of calculations needed for an illuminating analysis, for example, by replacing analytic approximations by computational ones, or requiring numeric optimization or integration over high-dimensional spaces. We introduce the subject with a very simple yet useful example, and then consider some of the areas in which computer-intensive methods are used, to give a flavor of current research. Keywords: bootstrap; jackknife; image analysis; optimization; Monte Carlo
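As an illustration of replacing an analytic approximation with a computational one, the sketch below uses the nonparametric bootstrap to estimate the standard error of a sample median. It is not the example used in the article. The data, the function name, and the number of resamples are assumptions chosen for illustration.

```python
import random
import statistics


def bootstrap_se(data, stat=statistics.median, n_resamples=2000, seed=0):
    """Estimate the standard error of `stat` by resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_resamples):
        # Draw n observations from the data, with replacement.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(stat(resample))
    # The spread of the replicated statistics approximates the standard error.
    return statistics.stdev(replicates)


sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 6.1, 3.3, 4.9]  # hypothetical data
se_median = bootstrap_se(sample)
print(f"bootstrap SE of the median: {se_median:.3f}")
```

The cost is a few thousand recomputations of the statistic in place of a closed-form variance formula, which is the trade of computer time for analytic effort that the abstract describes.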
July 2004
·
1 Read
... Since 1985, neural networks have rapidly evolved and gained widespread use. These algorithms have emerged as a versatile class of non-linear regression techniques, proving effective for both classification and regression tasks [37]. Types of neural network include narrow, medium, wide, bi-layered, and tri-layered neural networks. ...
September 1994
Journal of the Royal Statistical Society Series B (Methodological)
... The presence of statistically significant linear relationships between variables was verified using the standard Pearson least squares method and a robust MM estimation method that is less sensitive to outliers and non-normally distributed errors (Susanti et al., 2014; Venables & Ripley, 1999). The letters Davide A.L. Vignati et al. ...
January 1999
... There are now several books that describe how to use R for data analysis and statistics, as well as documentation for S and S-Plus that can be used with R, keeping in mind the differences between them (VENABLES, 2004). Many people use R as a statistical system, and accordingly many modern statistical techniques have been implemented. Some of them are built into the base R environment and others are supplied as "packages". ...
January 1997
... The two-step approach described above ( Figure 2) is general; additional classification methods can be included, and the approach can be applied to any (natural landscape) classification task. Model selection and variable explanation were implemented in R [51]. ...
January 1994
... The association between loneliness, and disability and learning type was assessed using an ordinal logistic regression model; the association between CGA and disability was assessed using an ordinal logistic regression model also. Both ordinal regression models used the MASS package within R (Venables and Ripley, 2002). Findings were reported as odds ratios (OR) for both model 1 and 2, reporting error as 95% CIs. ...
January 2002
... For each model, we tested the variables one by one against the null model, then kept those where the difference was significant (p < 0.05). Finally, we created a model containing the selected variables and chose the best model as the one with the lowest Akaike Information Criterion (AIC) (Sakamoto et al. 1986) using a forward stepwise algorithm (Venables and Ripley 2002) from stepAIC (package MASS) (Ripley et al. 2013). Interactions between variables were tested where biologically relevant. ...
January 1997
... Data from the CMD and CBSD assessment survey in cassava fields were statistically analysed using R version 3.6.1 [26]. Generalised linear models (glm) were used for analysis of deviance (ANODEV) using the MASS package [27]. A negative binomial linear model was used for analysis of all parameters except severity, while a quasibinomial linear model was used for severity. ...
January 2002
... 'Species' represents a response variable that falls in a non-ordered finite set of categories. Accordingly, multinomial data are given [40,41]. Multinomial logistic regression was used to model nominal outcome variables in which the log odds of the outcomes were modeled as a linear combination of the predictor variables. ...
January 2002
... The initial training and subsequent 10-fold cross-validation were orchestrated using the 'caret' package in R, with the receiver operating characteristic as the designated performance metric. Parameter calibration, specific to machine-learning approach, underwent a basic tuning process following defaults set by the package [13][14][15][16]. Additionally, receiver operating characteristic curves were generated using the 'pROC' package. ...
January 2002
... " Halliday (1967) adds that given information is often represented anaphorically, by means of reference (pronominals and demonstratives), substitutes (words like one and do), and ellipsis (no realisation in the text). Moreover, in English sentences, usually the portion bearing given information precedes the portion conveying new information (Quirk, Greenbaum, Leech, & Svartvik, 1972; Prince, 1978; Chafe, 1979; Kuno, 1980; Fries, 1983). The portion that bears the given information is often the complete subject, and the portion that bears the new information is often the complete predicate. ...
January 2000