Article
Graphical modeling of binary data using the LASSO: a simulation study.
Institute for Medical Informatics, Biometrics and Epidemiology, Ludwig-Maximilians-Universität München, Munich, Germany.
BMC Medical Research Methodology (impact factor: 2.67). 02/2012; 12:16.
DOI:10.1186/1471-2288-12-16
pp.16
Source: PubMed
Citations (22)
Article: Heuristics of instability and stabilization in model selection
ABSTRACT: In model selection, usually a "best" predictor is chosen from a collection $\{\hat{\mu}(\cdot, s)\}$ of predictors, where $\hat{\mu}(\cdot, s)$ is the minimum least-squares predictor in a collection $\mathsf{U}_s$ of predictors. Here $s$ is a complexity parameter; that is, the smaller $s$, the lower dimensional/smoother the models in $\mathsf{U}_s$.
If $\mathsf{L}$ is the data used to derive the sequence $\{\hat{\mu}(\cdot, s)\}$, the procedure is called unstable if a small change in $\mathsf{L}$ can cause large changes in $\{\hat{\mu}(\cdot, s)\}$. With a crystal ball, one could pick the predictor in $\{\hat{\mu}(\cdot, s)\}$ having minimum prediction error. Without prescience, one uses test sets, cross-validation and so forth. The difference in prediction error between the crystal ball selection and the statistician's choice we call predictive loss. For an unstable procedure the predictive loss is large. This is shown by some analytics in a simple case and by simulation results in a more complex comparison of four different linear regression methods. Unstable procedures can be stabilized by perturbing the data, getting a new predictor sequence $\{\hat{\mu}'(\cdot, s)\}$ and then averaging over many such predictor sequences.
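The stabilization idea in the last sentence of this abstract (perturb the data, refit, average) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the cited article: the "unstable procedure" is a toy hard-threshold rule on a sample mean, the perturbation is Gaussian noise added to the observations, and the names `hard_threshold_mean`, `stabilized_estimate`, `cutoff`, and `noise_sd` are hypothetical.

```python
import random
import statistics


def hard_threshold_mean(ys, cutoff):
    """Toy unstable procedure (illustrative assumption): report the sample
    mean only if its absolute value exceeds the cutoff, otherwise report 0."""
    m = statistics.fmean(ys)
    return m if abs(m) > cutoff else 0.0


def stabilized_estimate(ys, cutoff, noise_sd, n_perturbations, rng):
    """Average the toy procedure over many noise-perturbed copies of the data."""
    estimates = []
    for _ in range(n_perturbations):
        # Perturb every observation with independent Gaussian noise.
        perturbed = [y + rng.gauss(0.0, noise_sd) for y in ys]
        estimates.append(hard_threshold_mean(perturbed, cutoff))
    return statistics.fmean(estimates)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    ys = [rng.gauss(0.5, 1.0) for _ in range(25)]  # simulated data, not real
    raw = hard_threshold_mean(ys, cutoff=0.5)
    smooth = stabilized_estimate(ys, cutoff=0.5, noise_sd=1.0,
                                 n_perturbations=500, rng=rng)
    print("single fit:", raw)
    print("averaged over perturbations:", smooth)
```

The single fit jumps between 0 and the sample mean as the data cross the cutoff, whereas the averaged version changes gradually; that contrast is the only point the sketch is meant to convey.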
Article: Methodological considerations, such as directed acyclic graphs, for studying "acute on chronic" disease epidemiology: chronic obstructive pulmonary disease example.
ABSTRACT: Acute exacerbations of chronic disease are ubiquitous in clinical medicine, and thus far, there has been a paucity of integrated methodological discussion on this phenomenon. We use acute exacerbations of chronic obstructive pulmonary disease as an example to emphasize key epidemiological and statistical issues for this understudied field in clinical epidemiology. Directed acyclic graphs are a useful epidemiological tool to explain the differential effects of risk factors on health outcomes in studies of acute and chronic phases of disease. To study the pathogenesis of acute exacerbations of chronic disease, case-crossover design and time-series analysis are well-suited study designs to differentiate acute and chronic effects. Modeling changes over time and setting appropriate thresholds are important steps to separate acute from chronic phases of disease in serial measurements. In statistical analysis, acute exacerbations are recurrent events, and some individuals are more prone to recurrences than others. Therefore, appropriate statistical modeling should take into account intraindividual dependence. Finally, we recommend the use of "event-based" number needed to treat (NNT) to prevent a single exacerbation instead of traditional patient-based NNT. Addressing these methodological challenges will advance research quality in acute on chronic disease epidemiology.
Journal of clinical epidemiology 03/2009; 62(9):982-90. · 2.96 Impact Factor
Article: Graphical models illustrated complex associations between variables describing human functioning.
ABSTRACT: To examine whether graphical modeling is a potentially useful method for the study of human functioning using data collected by means of the International Classification of Functioning, Disability and Health (ICF). The applicability of the method was examined in a convenience sample of 616 patients from a cross-sectional multicentric study undergoing early postacute rehabilitation. Functioning was qualified using 115 second-level ICF categories. The modeling was carried out on a data set with imputed missing values. The least absolute shrinkage and selection operator (LASSO) for generalized linear models was used to identify conditional dependencies between the ICF categories. Bootstrap aggregating was used to enhance the accuracy and validity of model selection. The resulting graph showed highly meaningful relationships. For example, one structure centered around speaking and included three paths addressing conversation, speech functions, and mental functions of language. Graphical modeling of human functioning using data collected by means of the ICF yields clinically meaningful results. The structures found may be the basis for the identification of suitable targets for rehabilitation interventions, the identification of confounders and intermediate variables, and the selection of parsimonious sets of variables for multivariate epidemiological modeling.
Journal of clinical epidemiology 07/2009; 62(9):922-33. · 2.96 Impact Factor
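One way to picture how per-variable LASSO selections and bootstrap aggregating could combine into a single graph is sketched below. This is not the cited study's implementation: the LASSO fits themselves are assumed to have been run already, the AND/OR symmetrization rule and the 50% inclusion threshold are illustrative assumptions, and the variable names `a`, `b`, `c` and the selections attached to them are made up.

```python
from collections import Counter
from itertools import combinations


def symmetrize(neighborhoods, rule="and"):
    """Turn per-variable selected-neighbor sets into undirected edges.

    rule="and": keep edge i-j only if j was selected for i AND i for j.
    rule="or":  keep edge i-j if either regression selected the other.
    """
    edges = set()
    for i, j in combinations(sorted(neighborhoods), 2):
        i_has_j = j in neighborhoods[i]
        j_has_i = i in neighborhoods[j]
        keep = (i_has_j and j_has_i) if rule == "and" else (i_has_j or j_has_i)
        if keep:
            edges.add((i, j))
    return edges


def aggregate_edges(replicates, min_fraction=0.5, rule="and"):
    """Keep edges that appear in at least min_fraction of bootstrap replicates."""
    counts = Counter()
    for neighborhoods in replicates:
        counts.update(symmetrize(neighborhoods, rule))
    n = len(replicates)
    return {edge for edge, c in counts.items() if c / n >= min_fraction}


if __name__ == "__main__":
    # Hypothetical selections from three bootstrap replicates (made-up values).
    replicates = [
        {"a": {"b"}, "b": {"a"}, "c": set()},
        {"a": {"b", "c"}, "b": {"a"}, "c": set()},
        {"a": set(), "b": {"a"}, "c": {"a"}},
    ]
    print(aggregate_edges(replicates, min_fraction=0.5, rule="and"))
    print(aggregate_edges(replicates, min_fraction=0.5, rule="or"))
```

Under the AND rule only the edge between `a` and `b` survives the threshold in this toy input; the OR rule is more liberal and also keeps the edge between `a` and `c`.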
Keywords
binary data
Bootstrap aggregating
building Gaussian graphical models
conservative models
continuous multivariate data
derive graphical models
dimensional binary data
Graphical models
modeling high-dimensional clinical data
multivariate normal distribution
promising new approach
random data
real-life data
Satisfactory solutions
simulation study
small penalty term
small sample size
symmetric local logistic regression models
variable selection
Youden Index