Stochastic Gradient Descent Log-Likelihood Estimation in the Cox Proportional Hazards Model

R Conference  |  useR 2017, Coxph, Survival analysis
In the last decade, the volume of data has grown faster than the speed of processors. As a result, statistical machine learning methods have become limited more by computation time than by the volume of the datasets.

For large-scale data, compromise solutions depend on the computational complexity of the optimization methods, which has to be handled in a non-trivial way.

One such solution is the family of optimization algorithms based on stochastic gradient descent (Bottou (2010), Bottou (2012), Widrow (1960)), which are highly efficient when operating on large-scale data.

In my presentation I will describe the stochastic gradient descent algorithm as applied to log-likelihood estimation of the coefficients of the Cox proportional hazards model.
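To make the idea concrete, here is a minimal sketch of mini-batch stochastic gradient ascent on the Cox partial log-likelihood, written in plain Python for illustration. The abstract does not spell out the details of the algorithm, so everything specific here is an assumption: risk sets are formed inside each mini-batch only, ties follow the Breslow convention, the step size decays as lr / sqrt(step), and the function names cox_batch_gradient and sgd_cox are hypothetical. For an observed event i, the gradient contribution is x_i minus the exp(x . beta)-weighted mean of x over the observations still at risk at time t_i.

```python
import math
import random


def cox_batch_gradient(beta, batch):
    """Gradient of the Cox partial log-likelihood restricted to one mini-batch.

    batch is a list of (time, event, x) tuples, where event is 1 for an
    observed event and 0 for a censored observation. Risk sets are built
    from the batch only (an assumption of this sketch).
    """
    p = len(beta)
    rows = sorted(batch, key=lambda r: -r[0])  # descending follow-up time
    grad = [0.0] * p
    sum_w = 0.0          # running sum of exp(x_j . beta) over the risk set
    sum_wx = [0.0] * p   # running sum of exp(x_j . beta) * x_j
    i = 0
    while i < len(rows):
        # First add every observation tied at this time to the risk set.
        j = i
        while j < len(rows) and rows[j][0] == rows[i][0]:
            x = rows[j][2]
            w = math.exp(sum(b * v for b, v in zip(beta, x)))
            sum_w += w
            for k in range(p):
                sum_wx[k] += w * x[k]
            j += 1
        # Each event contributes x_i minus the weighted risk-set mean of x.
        for _, event, x in rows[i:j]:
            if event:
                for k in range(p):
                    grad[k] += x[k] - sum_wx[k] / sum_w
        i = j
    return grad


def sgd_cox(data, n_features, batch_size=32, epochs=30, lr=0.2, seed=0):
    """Estimate Cox coefficients by mini-batch stochastic gradient ascent."""
    rng = random.Random(seed)
    data = list(data)
    beta = [0.0] * n_features
    step = 0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            step += 1
            rate = lr / math.sqrt(step)  # decaying step size (assumed schedule)
            grad = cox_batch_gradient(beta, batch)
            # Ascent step: the partial log-likelihood is maximised.
            beta = [b + rate * g / len(batch) for b, g in zip(beta, grad)]
    return beta
```

Because each update touches only one batch, the same loop could in principle be fed batches as they arrive rather than from a fixed dataset. Batch size, learning rate and number of epochs are illustrative defaults, not values taken from the talk.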

This algorithm can be used successfully in time-to-event analyses in which the number of explanatory variables significantly exceeds the number of observations.

This method of estimating coefficients with stochastic gradient descent can be applied to survival analyses in areas such as molecular biology, bioinformatic screening of gene expression, or analyses based on DNA microarrays, which are widely used in clinical diagnostics, treatment and research.

The resulting estimation workflow was a new approach (at the time I wrote my master's thesis), not known in the literature. It is resistant to the problem of collinearity among variables and works well when the coefficients have to be improved continuously on streaming data.



Source: useR 2017