B. D. Ripley’s research while affiliated with University of Oxford and other places


Publications (94)


Classification: Further Developments
  • Chapter

September 2014 · 3 Reads

B. D. Ripley

This article has no abstract.



Estimating Disease Prevalence Using Relatives of Case and Control Probands

June 2009 · 35 Reads · 10 Citations

Biometrics

We introduce a method of estimating disease prevalence from case-control family study data. Case-control family studies are performed to investigate the familial aggregation of disease; families are sampled via either a case or a control proband, and the resulting data contain information on disease status and covariates for the probands and their relatives. Here, we introduce estimators for overall prevalence and for covariate-stratum-specific (e.g., sex-specific) prevalence. These estimators combine the proportion of affected relatives of control probands with the proportion of affected relatives of case probands and are designed to yield approximately unbiased estimates of their population counterparts under certain commonly made assumptions. We also introduce corresponding confidence intervals designed to have good coverage properties even for small prevalences. Next, we describe simulation experiments where our estimators and intervals were applied to case-control family data sampled from fictional populations with various levels of familial aggregation. At all aggregation levels, the resulting estimates varied closely and symmetrically around their population counterparts, and the resulting intervals had good coverage properties, even for small sample sizes. Finally, we discuss the assumptions required for our estimators to be approximately unbiased, highlighting situations where an alternative estimator based only on relatives of control probands may perform better.


Pattern Recognition And Neural Networks

January 2008 · 1,094 Reads · 3,727 Citations

Ripley brings together two crucial ideas in pattern recognition: statistical methods and machine learning via neural networks. He brings unifying principles to the fore, and reviews the state of the subject. Ripley also includes many examples to illustrate real problems in pattern recognition and how to overcome them.


An “Unfolding” Latent Variable Model for Likert Attitude Data

June 2007 · 78 Reads · 55 Citations

Likert attitude data consist of responses to favorable and unfavorable statements about an entity, where responses fall into ordered categories ranging from disagreement to agreement. Social science and marketing researchers frequently use data of this type to measure attitudes toward an entity such as a policy or product. We focus on data on American and British attitudes toward their respective nations (“national pride”). We introduce a multidimensional unfolding model (MUM) to describe the relationship between the data and the attitudes underlying them. Unlike most existing models, the MUM allows the data to reflect not just attitudes, but also response style, which is defined as a consistent and content-independent pattern of response category selection such as a tendency to agree with all statements. The MUM can be used to model multiple attitudes, which allows researchers to expand their analysis of the data of interest to include all available Likert data so as to increase information on response style. For example, we include additional data on immigration attitudes to help distinguish the effects of response style and national pride on our data. The MUM can be used to fit linear models for the effects of background variables on attitudes. Resulting inferences about attitudes are adjusted for response style and should be less biased. Simulation results strongly suggest that, unlike Likert’s popular scoring model, the MUM yields unbiased inferences even when there are unequal proportions of favorable and unfavorable statements.


An "Unfolding" Latent Variable Model for Likert Attitude Data: Drawing Inferences Adjusted for Response Style

June 2007 · 96 Reads · 56 Citations

Journal of the American Statistical Association

Likert attitude data consist of responses to favorable and unfavorable statements about an entity, where responses fall into ordered categories ranging from disagreement to agreement. Social science and marketing researchers frequently use data of this type to measure attitudes toward an entity such as a policy or product. We focus on data on American and British attitudes toward their respective nations ("national pride"). We introduce a multidimensional unfolding model (MUM) to describe the relationship between the data and the attitudes underlying them. Unlike most existing models, the MUM allows the data to reflect not just attitudes, but also response style, which is defined as a consistent and content-independent pattern of response category selection such as a tendency to agree with all statements. The MUM can be used to model multiple attitudes, which allows researchers to expand their analysis of the data of interest to include all available Likert data so as to increase information on response style. For example, we include additional data on immigration attitudes to help distinguish the effects of response style and national pride on our data. The MUM can be used to fit linear models for the effects of background variables on attitudes. Resulting inferences about attitudes are adjusted for response style and should be less biased. Simulation results strongly suggest that, unlike Likert's popular scoring model, the MUM yields unbiased inferences even when there are unequal proportions of favorable and unfavorable statements.




Computer‐Intensive Methods

July 2005 · 52 Reads · 5 Citations

One sense of “computer-intensive” statistics is just statistical methodology that makes use of a large amount of computer time. (Examples include the bootstrap, jackknife, smoothing, image analysis, and many uses of the EM algorithm.) However, the term is usually used for methods that go beyond the minimum of calculations needed for an illuminating analysis, for example, by replacing analytic approximations by computational ones, or requiring numeric optimization or integration over high-dimensional spaces. We introduce the subject with a very simple yet useful example, and then consider some of the areas in which computer-intensive methods are used, to give a flavor of current research.

Keywords: bootstrap; jackknife; image analysis; optimization; Monte Carlo



Citations (69)


... Since 1985, neural networks have rapidly evolved and gained widespread use. These algorithms have emerged as a versatile class of non-linear regression techniques, proving effective for both classification and regression tasks [37]. Narrow, medium, wide, bi-layered, and tri-layered neural networks are types of neural network. ...

Reference:

Elemental Analysis and Classification of Nicotine Pouches Using Machine Learning Assisted Laser Induced Breakdown Spectroscopy
Neural Networks and Related Methods for Classification
  • Citing Article
  • September 1994

Journal of the Royal Statistical Society Series B (Methodological)

... The presence of statistically significant linear relationships between variables was verified using the standard Pearson least squares method and a robust MM estimation method that is less sensitive to outliers and non-normally distributed errors (Susanti et al., 2014; Venables & Ripley, 1999). The letters Davide A.L. Vignati et al. ...

Statistics and Computing
  • Citing Book
  • January 1999

... There are now several books that describe how to use R for data analysis and statistics, as well as documentation for S and S-Plus, which can be used together with R, bearing in mind the differences between them (VENABLES, 2004). Many people use R as a statistical system and, as a result, many modern statistical methods have been implemented. Some of them are built into the base R distribution and others are supplied as "packages". ...

Introduction
  • Citing Chapter
  • January 1997

... The association between loneliness, and disability and learning type was assessed using an ordinal logistic regression model; the association between CGA and disability was assessed using an ordinal logistic regression model also. Both ordinal regression models used the MASS package within R (Venables and Ripley, 2002). Findings were reported as odds ratios (OR) for both model 1 and 2, reporting error as 95% CIs. ...

Generalized Linear Models
  • Citing Article
  • January 2002

... For each model, we tested the variables one by one against the null model, then kept those where the difference was significant (p < 0.05). Finally, we created a model containing the selected variables and chose the best model as the one with the lowest Akaike Information Criterion (AIC) (Sakamoto et al. 1986) using a forward stepwise algorithm (Venables and Ripley 2002) from stepAIC (package MASS) (Ripley et al. 2013). Interactions between variables were tested where biologically relevant. ...

Random and Mixed Effects. Modern Applied Statistics with S. Statistics and Computing. New York: Springer
  • Citing Book
  • January 1997

... Data for CMD and CBSD assessment survey in cassava fields was statistically analysed using R version 3.6.1 [26]. Generalised linear models (glm) were used for analysis of deviance (ANODEV) using MASS package [27]. Negative binomial linear model was used for analysis of all other parameters except severity, while quasibinomial linear model was used for severity. ...

Modern Applied Statistics with S
  • Citing Book
  • January 2002

... 'Species' represents a response variable that falls in a non-ordered finite set of categories. Accordingly, multinomial data are given [40,41]. Multinomial logistic regression was used to model nominal outcome variables in which the log odds of the outcomes were modeled as a linear combination of the predictor variables. ...

Exploratory Multivariate Analysis
  • Citing Chapter
  • January 2002

... The initial training and subsequent 10-fold cross-validation were orchestrated using the 'caret' package in R, with the receiver operating characteristic as the designated performance metric. Parameter calibration, specific to machine-learning approach, underwent a basic tuning process following defaults set by the package [13][14][15][16]. Additionally, receiver operating characteristic curves were generated using the 'pROC' package. ...

Classification
  • Citing Chapter
  • January 2002

... " Halliday (1967) adds that given information is often represented anaphorically, by means of reference (pronominals and demonstratives), substitutes (words like one and do), and ellipsis (no realisation in the text). Moreover, in English sentences, usually the portion bearing given information precedes the portion conveying new information (Quirk, Greenbaum, Leech, & Svartvik, 1972; Prince, 1978; Chafe, 1979; Kuno, 1980; Fries, 1983). The portion that bears the given information is often the complete subject, and the portion that bears the new information is often the complete predicate. ...

The S Language: Syntax and Semantics
  • Citing Chapter
  • January 2000