10th Apr, 2022

Malla Reddi Institute of Medical Sciences

Question

Asked 10th Apr, 2022

Dear biostats community,

I am trying to build a Cox proportional hazards regression model and have a dataset with several variables. I am trying to decide which variables are the most useful as covariates in my model. Which method do you use and recommend for this? At first I thought of running a univariate analysis and excluding the variables that show no significant survival difference, but as I understand it, such procedures have the problem that they don't take into account the interactions between the different variables.

Thank you very much!!

Gabriel


Hi,

One option is to select covariates that the literature has already shown to be associated with the outcome. Another way is stepwise selection.

Here is a reference link for different methods that can be used:

and

Garcia, R. I., Ibrahim, J. G., & Zhu, H. (2010). Variable selection in the Cox regression model with covariates missing at random. *Biometrics*, *66*(1), 97–104. https://doi.org/10.1111/j.1541-0420.2009.01274.x

For a theoretical discussion:

Handbook of Survival Analysis by John P. Klein, CRC Press Taylor & Francis Group


Stepwise would be a terrible choice. I recommend lasso or the adaptive lasso version. See the attached papers. Best wishes, David Booth

- jds_G-Y1.pdf (349.71 KB)
- Boosting_and_lassoing_new_prostate_cancer_SNP_risk-2.pdf (941.73 KB)
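
To make the lasso suggestion a bit more concrete, here is a minimal sketch of an L1-penalized Cox fit in Python. It is not taken from the attached papers. It minimizes the mean negative log partial likelihood (Breslow handling of ties) plus `lam` times the L1 norm of the coefficients, using proximal gradient descent with a backtracking line search. All function names (`soft_threshold`, `cox_loss_and_grad`, `fit_lasso_cox`, `simulate`) and the toy data are assumptions made for illustration, and the code uses only the standard library with plain O(n²) loops, so it is meant for reading rather than for real datasets. In practice you would likely use an established package (for example glmnet in R, which supports the Cox family), standardize the covariates first, and choose the penalty by cross-validation.

```python
import math
import random


def soft_threshold(z, t):
    """Proximal operator of t * |z|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def cox_loss_and_grad(beta, X, time, event):
    """Mean negative log partial likelihood (Breslow ties) and its gradient."""
    n, p = len(X), len(beta)
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    shift = max(eta)  # subtract the max before exponentiating, for stability
    w = [math.exp(e - shift) for e in eta]
    loss, grad = 0.0, [0.0] * p
    for i in range(n):
        if not event[i]:
            continue  # censored subjects contribute only through risk sets
        risk = [j for j in range(n) if time[j] >= time[i]]
        denom = sum(w[j] for j in risk)
        loss -= eta[i] - (math.log(denom) + shift)
        for k in range(p):
            mean_k = sum(w[j] * X[j][k] for j in risk) / denom
            grad[k] -= X[i][k] - mean_k
    return loss / n, [g / n for g in grad]


def fit_lasso_cox(X, time, event, lam, n_iter=200):
    """Proximal gradient descent with backtracking; returns the coefficients."""
    beta = [0.0] * len(X[0])
    step = 1.0
    for _ in range(n_iter):
        loss, grad = cox_loss_and_grad(beta, X, time, event)
        while True:
            cand = [soft_threshold(b - step * g, step * lam)
                    for b, g in zip(beta, grad)]
            diff = [c - b for c, b in zip(cand, beta)]
            # Quadratic upper bound that a small enough step must satisfy.
            bound = (loss + sum(g * d for g, d in zip(grad, diff))
                     + sum(d * d for d in diff) / (2 * step))
            if cox_loss_and_grad(cand, X, time, event)[0] <= bound + 1e-12:
                break
            step *= 0.5  # step too large: halve it and retry
        beta = cand
    return beta


def simulate(n=40, seed=0):
    """Toy data (hypothetical): only the first of four covariates matters."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    # Exponential event times with hazard exp(1.0 * x1); independent censoring.
    t_event = [rng.expovariate(math.exp(row[0])) for row in X]
    t_cens = [rng.expovariate(0.3) for _ in range(n)]
    time = [min(a, b) for a, b in zip(t_event, t_cens)]
    event = [a <= b for a, b in zip(t_event, t_cens)]
    return X, time, event


if __name__ == "__main__":
    X, time, event = simulate()
    _, g0 = cox_loss_and_grad([0.0] * 4, X, time, event)
    # Smallest penalty at which every coefficient is shrunk to zero.
    lam_max = max(abs(g) for g in g0)
    for frac in (1.0, 0.5, 0.1):
        beta = fit_lasso_cox(X, time, event, lam=frac * lam_max)
        print(f"lambda = {frac:.1f} * lam_max:", [round(b, 3) for b in beta])
```

Running the script prints the coefficient vector at three penalty levels. The point of the exercise is that the penalty does the selection: coefficients that are shrunk exactly to zero drop out of the model, and all covariates are considered jointly rather than one at a time as in univariate screening.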
