What is a good method to select variables for Cox Regression?
Dear biostats community,
I am trying to build a Cox proportional hazards regression model and have a dataset with several candidate variables. I need to decide which of them are most useful as covariates. Which method do you use and recommend for this? My first thought was to run a univariate analysis for each variable and exclude those that show no significant difference in survival. As I understand it, though, such procedures have a problem: they do not take into account the interactions between the different variables.
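To make the univariate screening described above concrete, here is a minimal sketch in plain Python. It fits a one-covariate Cox model per variable by Newton-Raphson on the Breslow partial likelihood and reports a Wald p-value. The function names (`score_and_info`, `fit_univariate_cox`), the covariate names, and the ten-subject dataset are all made up for illustration; in practice you would use an established survival package rather than this hand-rolled fit.

```python
import math

def score_and_info(beta, times, events, x):
    """Score and observed information of the Breslow partial
    log-likelihood for a single covariate x, evaluated at beta."""
    score, info = 0.0, 0.0
    for t_i, d_i, x_i in zip(times, events, x):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        s0 = s1 = s2 = 0.0
        for t_j, x_j in zip(times, x):
            if t_j >= t_i:  # subject j is still at risk at time t_i
                w = math.exp(beta * x_j)
                s0 += w
                s1 += w * x_j
                s2 += w * x_j * x_j
        mean = s1 / s0
        score += x_i - mean
        info += s2 / s0 - mean * mean
    return score, info

def fit_univariate_cox(times, events, x, max_iter=50, tol=1e-10):
    """Return (beta, standard error, two-sided Wald p-value)."""
    center = sum(x) / len(x)
    xc = [v - center for v in x]  # centering leaves beta unchanged
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_info(beta, times, events, xc)
        step = score / info  # Newton-Raphson update
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_info(beta, times, events, xc)
    se = 1.0 / math.sqrt(info)
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))  # normal approximation
    return beta, se, p

# Hypothetical toy data: follow-up time, event indicator (1 = event, 0 = censored)
times = [5, 8, 12, 14, 20, 21, 25, 30, 33, 40]
events = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]
covariates = {
    "treatment": [1, 1, 0, 1, 1, 0, 1, 0, 0, 0],
    "age": [60, 72, 55, 48, 66, 59, 51, 63, 45, 50],
}

# Screen each covariate on its own, ignoring all the others
results = {name: fit_univariate_cox(times, events, x)
           for name, x in covariates.items()}
for name, (beta, se, p) in results.items():
    print(f"{name}: beta={beta:.3f}, HR={math.exp(beta):.3f}, se={se:.3f}, p={p:.3f}")
```

Each fit here sees only one covariate, which is exactly the limitation raised in the question: a variable can look unimportant alone and matter once others are adjusted for, or the reverse.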
Garcia, R. I., Ibrahim, J. G., & Zhu, H. (2010). Variable selection in the Cox regression model with covariates missing at random. Biometrics, 66(1), 97–104. https://doi.org/10.1111/j.1541-0420.2009.01274.x
For a theoretical discussion:
Handbook of Survival Analysis, edited by John P. Klein et al., CRC Press (Taylor & Francis Group)