<?xml version="1.0" encoding="UTF-8" ?>
<!-- RSS generated by PHPBoost on Tue, 14 Apr 2026 06:39:59 +0200 -->

<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title><![CDATA[Easy Guides]]></title>
		<atom:link href="https://www.sthda.com/english/syndication/rss/wiki/47" rel="self" type="application/rss+xml"/>
		<link>https://www.sthda.com</link>
		<description><![CDATA[Last articles of the category: Survival Analysis]]></description>
		<copyright>(C) 2005-2026 PHPBoost</copyright>
		<language>en</language>
		<generator>PHPBoost</generator>
		
		
		<item>
			<title><![CDATA[Survival Analysis]]></title>
			<link>https://www.sthda.com/english/wiki/survival-analysis</link>
			<guid>https://www.sthda.com/english/wiki/survival-analysis</guid>
			<description><![CDATA[<!-- START HTML -->

  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">

<p><br/></p>
<p><strong>Survival analysis</strong> corresponds to a set of statistical methods for investigating the time it takes for an event of interest to occur.</p>
<br/>
<div class="block">
<p>In this chapter, we start by describing how to fit survival curves and how to perform logrank tests comparing the survival time of two or more groups of individuals. We continue by demonstrating how to assess simultaneously the impact of multiple risk factors on the survival time using the Cox regression model. Finally, we describe how to check the validity of the Cox model assumptions.</p>
</div>
<p><br/></p>
<div id="survival-analysis-toolkits-in-r" class="section level2">
<h2>Survival analysis toolkits in R</h2>
<p>We’ll use two R packages for survival data analysis and visualization:</p>
<ol style="list-style-type: decimal">
<li>the <em>survival</em> package for survival analyses,</li>
<li>and the <em>survminer</em> package for ggplot2-based elegant visualization of survival analysis results</li>
</ol>
<p>For survival analyses, the following functions [in the survival package] will be used:</p>
<ul>
<li><em>Surv</em>() to create a survival object</li>
<li><em>survfit</em>() to fit survival curves (Kaplan-Meier estimates)</li>
<li><em>survdiff</em>() to perform log-rank test comparing survival curves</li>
<li><em>coxph</em>() to compute the Cox proportional hazards model</li>
</ul>
<p>For visualization, we’ll use the following functions available in the survminer package:</p>
<ul>
<li><em>ggsurvplot</em>() for visualizing survival curves</li>
<li><em>ggcoxzph</em>(), <em>ggcoxdiagnostics</em>() and <em>ggcoxfunctional</em>() for checking the Cox model assumptions.</li>
</ul>
<p>These two packages can be installed as follows:</p>
<pre class="r"><code>install.packages("survival")
install.packages("survminer")</code></pre>
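<p>As a rough orientation, the four survival functions listed above could be combined as in the sketch below. It assumes the <em>lung</em> data set that ships with the survival package and uses sex as an arbitrary grouping variable; the object names are illustrative only.</p>
<pre class="r"><code>library("survival")

# Kaplan-Meier curves by sex; Surv() builds the survival object
km.fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Log-rank test comparing the two curves
lr.test <- survdiff(Surv(time, status) ~ sex, data = lung)

# Cox proportional hazards model with the same covariate
cox.fit <- coxph(Surv(time, status) ~ sex, data = lung)

print(km.fit)
print(lr.test)
print(cox.fit)</code></pre>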
</div>
<div id="contents" class="section level2">
<h2>Contents</h2>
<br/>
<div class="block">
<ul>
<li><a href="https://www.sthda.com/english/english/wiki/survival-analysis-basics">Survival Analysis Basics: Curves and Logrank Tests</a></li>
<li><a href="https://www.sthda.com/english/english/wiki/cox-proportional-hazards-model">Cox Proportional Hazards Model</a></li>
<li><a href="https://www.sthda.com/english/english/wiki/cox-model-assumptions">Cox Model Assumptions</a></li>
</ul>
</div>
<p><br/></p>
<hr/>
</div>
<div id="survival-analysis-basics-curves-and-logrank-tests" class="section level2">
<h2>Survival analysis basics: curves and logrank tests</h2>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/survival-analysis-survival-curves-1.png" alt="Survival analysis" width="576" style="margin-bottom:10px;" />
<p class="caption">
Survival analysis
</p>
</div>
<ul>
<li>Objectives</li>
<li>Basic concepts
<ul>
<li>Survival time and type of events in cancer studies</li>
<li>Censoring</li>
<li>Survival and hazard functions</li>
<li>Kaplan-Meier survival estimate</li>
</ul></li>
<li>Survival analysis in R
<ul>
<li>Install and load required R package</li>
<li>Example data sets</li>
<li>Compute survival curves: survfit()</li>
<li>Access to the value returned by survfit()</li>
<li>Visualize survival curves</li>
<li>Kaplan-Meier life table: summary of survival curves</li>
<li>Log-Rank test comparing survival curves: survdiff()</li>
<li>Fit complex survival curves</li>
</ul></li>
</ul>
<p><span class="success">Read more –> <a href="https://www.sthda.com/english/english/wiki/survival-analysis-basics-curves-and-logrank-tests">Survival Analysis Basics: Curves and Logrank Tests</a></span></p>
</div>
<div id="cox-proportional-hazards-model" class="section level2">
<h2>Cox proportional hazards model</h2>
<ul>
<li>The need for multivariate statistical modeling</li>
<li>Basics of the Cox proportional hazards model</li>
<li>Compute the Cox model in R
<ul>
<li>Install and load required R package</li>
<li>R function to compute the Cox model: coxph()</li>
<li>Example data sets</li>
<li>Compute the Cox model</li>
<li>Visualizing the estimated distribution of survival times</li>
</ul></li>
</ul>
<p><span class="success">Read more –> <a href="https://www.sthda.com/english/english/wiki/cox-proportional-hazards-model">Cox Proportional Hazards Model</a>.</span></p>
</div>
<div id="cox-model-assumptions" class="section level2">
<h2>Cox model assumptions</h2>
<ul>
<li>Diagnostics for the Cox model</li>
<li>Assessing the validity of a Cox model in R
<ul>
<li>Installing and loading required R packages</li>
<li>Computing a Cox model</li>
<li>Testing the proportional hazards assumption</li>
<li>Testing influential observations</li>
<li>Testing nonlinearity</li>
</ul></li>
</ul>
<p><span class="success">Read more –> <a href="https://www.sthda.com/english/english/wiki/cox-model-assumptions">Cox Model Assumptions</a>.</span></p>
</div>
<div id="infos" class="section level2">
<h2>Infos</h2>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.3.2). </span></p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('#rdoc h1').addClass('wiki_paragraph1');
    jQuery('#rdoc h2').addClass('wiki_paragraph2');
    jQuery('#rdoc h3').addClass('wiki_paragraph3');
    jQuery('#rdoc h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Tue, 13 Dec 2016 01:26:54 +0100</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Cox Model Assumptions]]></title>
			<link>https://www.sthda.com/english/wiki/cox-model-assumptions</link>
			<guid>https://www.sthda.com/english/wiki/cox-model-assumptions</guid>
			<description><![CDATA[<!-- START HTML -->

  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">

<p><br/></p>
<p>Previously, we described the <a href="https://www.sthda.com/english/english/wiki/survival-analysis-basics">basic methods for analyzing survival data</a>, as well as the <a href="https://www.sthda.com/english/english/wiki/cox-proportional-hazards-model">Cox proportional hazards methods</a> for situations where several factors affect the survival process.</p>
<p>In the current article, we continue the series by describing methods to evaluate the validity of the <strong>Cox model assumptions</strong>.</p>
<p><span class="warning">Note that, when used inappropriately, statistical models may give rise to misleading conclusions. Therefore, it’s important to check that a given model is an appropriate representation of the data.</span></p>
<br/>
<div id="TOC" class = "block">
  <strong>Contents</strong><br/>
<ul>
<li><a href="#diagnostics-for-the-cox-model">Diagnostics for the Cox model</a></li>
<li><a href="#assessing-the-validy-of-a-cox-model-in-r">Assessing the validity of a Cox model in R</a><ul>
<li><a href="#installing-and-loading-required-r-packages">Installing and loading required R packages</a></li>
<li><a href="#computing-a-cox-model">Computing a Cox model</a></li>
<li><a href="#testing-proportional-hazards-assumption">Testing the proportional hazards assumption</a></li>
<li><a href="#testing-influential-observations">Testing influential observations</a></li>
<li><a href="#testing-non-linearity">Testing nonlinearity</a></li>
</ul></li>
<li><a href="#summary">Summary</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>
<br/>
<div id="diagnostics-for-the-cox-model" class="section level2">
<h2>Diagnostics for the Cox model</h2>
<p>The Cox proportional hazards model makes several assumptions. Thus, it is important to assess whether a fitted Cox regression model adequately describes the data.</p>
<p>Here, we’ll discuss three types of diagnostics for the Cox model:</p>
<ul>
<li>Testing the proportional hazards assumption.</li>
<li>Examining influential observations (or outliers).</li>
<li>Detecting nonlinearity in the relationship between the log hazard and the covariates.</li>
</ul>
<p>To check these model assumptions, <em>residual</em>-based methods are used. The common residuals for the Cox model include:</p>
<ul>
<li><em>Schoenfeld residuals</em> to check the proportional hazards assumption</li>
<li><em>Martingale residuals</em> to assess nonlinearity</li>
<li><em>Deviance residuals</em> (a symmetric transformation of the martingale residuals) to examine influential observations</li>
</ul>
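<p>Each of these residual types can be extracted directly from a fitted model with the <em>residuals</em>() method for coxph objects [in the survival package]. The sketch below assumes the <em>lung</em> data set and an illustrative model with age, sex and wt.loss; the object names are ours.</p>
<pre class="r"><code>library("survival")

fit <- coxph(Surv(time, status) ~ age + sex + wt.loss, data = lung)

# Schoenfeld residuals: one row per event, one column per covariate
sch.res <- residuals(fit, type = "schoenfeld")

# Martingale and deviance residuals: one value per observation used in the fit
mart.res <- residuals(fit, type = "martingale")
dev.res <- residuals(fit, type = "deviance")

dim(sch.res)
range(mart.res)   # martingale residuals never exceed 1
range(dev.res)</code></pre>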
</div>



<div id="assessing-the-validy-of-a-cox-model-in-r" class="section level2">
<h2>Assessing the validity of a Cox model in R</h2>
<div id="installing-and-loading-required-r-packages" class="section level3">
<h3>Installing and loading required R packages</h3>
<p>We’ll use two R packages:</p>
<ul>
<li><strong>survival</strong> for computing survival analyses</li>
<li><strong>survminer</strong> for visualizing survival analysis results</li>
</ul>
<ul>
<li>Install the packages</li>
</ul>
<pre class="r"><code>install.packages(c("survival", "survminer"))</code></pre>
<ul>
<li>Load the packages</li>
</ul>
<pre class="r"><code>library("survival")
library("survminer")</code></pre>
</div>
<div id="computing-a-cox-model" class="section level3">
<h3>Computing a Cox model</h3>
<p>We’ll use the lung data set and the <em>coxph</em>() function in the survival package.</p>
<p>To compute a Cox model, type this:</p>
<pre class="r"><code>library("survival")
res.cox <- coxph(Surv(time, status) ~ age + sex + wt.loss, data =  lung)
res.cox</code></pre>
<pre><code>Call:
coxph(formula = Surv(time, status) ~ age + sex + wt.loss, data = lung)

            coef exp(coef) se(coef)     z      p
age      0.02009   1.02029  0.00966  2.08 0.0377
sex     -0.52103   0.59391  0.17435 -2.99 0.0028
wt.loss  0.00076   1.00076  0.00619  0.12 0.9024

Likelihood ratio test=14.7  on 3 df, p=0.00212
n= 214, number of events= 152 
   (14 observations deleted due to missingness)</code></pre>
</div>
<div id="testing-proportional-hazards-assumption" class="section level3">
<h3>Testing the proportional hazards assumption</h3>
<p>The proportional hazards (PH) assumption can be checked using statistical tests and graphical diagnostics based on the <em>scaled Schoenfeld residuals</em>.</p>
<p><span class="success">In principle, the <em>Schoenfeld residuals</em> are independent of time. A plot that shows a non-random pattern against time is evidence of violation of the PH assumption.</span></p>
<p>The function <em>cox.zph</em>() [in the <em>survival</em> package] provides a convenient solution to test the proportional hazards assumption for each covariate included in a Cox regression model fit.</p>
<p>For each covariate, the function <em>cox.zph</em>() correlates the corresponding set of scaled Schoenfeld residuals with time, to test for independence between residuals and time. Additionally, it performs a global test for the model as a whole.</p>
<p><span class="success">The proportional hazard assumption is supported by a non-significant relationship between residuals and time, and refuted by a significant relationship.</span></p>
<p>To illustrate the test, we start by computing a Cox regression model using the lung data set [in survival package]:</p>
<pre class="r"><code>library("survival")
res.cox <- coxph(Surv(time, status) ~ age + sex + wt.loss, data =  lung)
res.cox</code></pre>
<pre><code>Call:
coxph(formula = Surv(time, status) ~ age + sex + wt.loss, data = lung)

            coef exp(coef) se(coef)     z      p
age      0.02009   1.02029  0.00966  2.08 0.0377
sex     -0.52103   0.59391  0.17435 -2.99 0.0028
wt.loss  0.00076   1.00076  0.00619  0.12 0.9024

Likelihood ratio test=14.7  on 3 df, p=0.00212
n= 214, number of events= 152 
   (14 observations deleted due to missingness)</code></pre>
<p>To test for the proportional-hazards (PH) assumption, type this:</p>
<pre class="r"><code>test.ph <- cox.zph(res.cox)
test.ph</code></pre>
<pre><code>            rho chisq     p
age     -0.0483 0.378 0.538
sex      0.1265 2.349 0.125
wt.loss  0.0126 0.024 0.877
GLOBAL       NA 2.846 0.416</code></pre>
<p><span class="success">From the output above, the test is not statistically significant for any of the covariates, and the global test is also not statistically significant. Therefore, we can assume proportional hazards.</span></p>
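<p>The test results can also be inspected programmatically, which is handy when a model has many covariates. The sketch below assumes that the object returned by <em>cox.zph</em>() stores its results in a matrix component named <em>table</em> with a column named “p”; the other columns differ between versions of the survival package (recent versions report a df column instead of rho). The 0.05 threshold is only a conventional choice.</p>
<pre class="r"><code>library("survival")

res.cox <- coxph(Surv(time, status) ~ age + sex + wt.loss, data = lung)
test.ph <- cox.zph(res.cox)

# One p-value per covariate, plus the GLOBAL test
p.values <- test.ph$table[, "p"]
p.values

# Terms for which the PH assumption is questionable at the 5% level
names(p.values)[p.values < 0.05]</code></pre>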
<p>It’s possible to do a graphical diagnostic using the function <em>ggcoxzph</em>() [in the <em>survminer</em> package], which produces, for each covariate, graphs of the scaled Schoenfeld residuals against the transformed time.</p>
<pre class="r"><code>ggcoxzph(test.ph)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/cox-model-assumptions-scaled-schoenfeld-residuals-1.png" alt="Cox Model Assumptions" width="480" style="margin-bottom:10px;" />
<p class="caption">
Cox Model Assumptions
</p>
</div>
<p>In the figure above, the solid line is a smoothing spline fit to the plot, with the dashed lines representing a +/- 2-standard-error band around the fit.</p>
<p><span class="warning">Note that, systematic departures from a horizontal line are indicative of non-proportional hazards, since proportional hazards assumes that estimates <span class="math inline">\(\beta_1, \beta_2, \beta_3\)</span> do not vary much over time.</span></p>
<p>From the graphical inspection, there is no pattern with time. The assumption of proportional hazards appears to be supported for the covariates sex (which is, recall, a two-level factor, accounting for the two bands in the graph), wt.loss and age.</p>
<p>Another graphical method for checking proportional hazards is to plot log(-log(S(t))) vs. t or log(t) and look for parallelism. This can be done only for categorical covariates.</p>
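<p>A minimal sketch of such a plot, assuming the <em>lung</em> data set and sex as the categorical covariate: the <em>plot</em>() method for survfit objects accepts <em>fun = "cloglog"</em>, which draws log(-log(S(t))) against time on a log scale. Colors and labels are arbitrary choices.</p>
<pre class="r"><code>library("survival")

# Kaplan-Meier estimate for each level of sex
km.sex <- survfit(Surv(time, status) ~ sex, data = lung)

# Complementary log-log plot; roughly parallel curves are consistent
# with proportional hazards
plot(km.sex, fun = "cloglog", col = c("blue", "red"),
     xlab = "Time in days (log scale)", ylab = "log(-log(S(t)))")
legend("topleft", legend = c("sex = 1", "sex = 2"),
       col = c("blue", "red"), lty = 1)</code></pre>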
<p>Violations of the proportional hazards assumption can be resolved by:</p>
<ul>
<li>Adding covariate*time interaction</li>
<li>Stratification</li>
</ul>
<p>Stratification is useful for “nuisance” confounders, whose effects you do not need to estimate. You cannot examine the effects of the stratification variable (John Fox &amp; Sanford Weisberg).</p>
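<p>Both remedies can be sketched with the survival package. The example below is an illustration only, not a recommendation for these data: it assumes the <em>lung</em> data set, treats sex as the problematic covariate, and uses log(t) as one possible time transform in the <em>tt</em>() term.</p>
<pre class="r"><code>library("survival")

# Option 1: stratify on sex; each stratum gets its own baseline hazard,
# so no coefficient is estimated for sex
fit.strata <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)

# Option 2: covariate*time interaction; the tt() term lets the effect
# of sex vary with log(time)
fit.tt <- coxph(Surv(time, status) ~ age + sex + tt(sex), data = lung,
                tt = function(x, t, ...) x * log(t))

coef(fit.strata)
coef(fit.tt)</code></pre>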
<p>To read more about how to handle non-proportional hazards, see the following articles:</p>
<ul>
<li>Jadwiga Borucka, PAREXEL, Warsaw, Poland. <a href="http://www.lexjansen.com/phuse/2013/sp/SP07.pdf">Extensions of cox model for non-proportional hazards purpose</a>. 2013.</li>
<li>John Fox &amp; Sanford Weisberg. <a href="https://socserv.socsci.mcmaster.ca/jfox/Books/Companion/appendix/Appendix-Cox-Regression.pdf">Cox Proportional-Hazards Regression for Survival Data in R</a>.</li>
<li>Max Gordon. <a href="https://www.r-bloggers.com/dealing-with-non-proportional-hazards-in-r/">Dealing with non-proportional hazards in R</a>. March 29, 2016.</li>
</ul>
</div>
<div id="testing-influential-observations" class="section level3">
<h3>Testing influential observations</h3>
<p>To test influential observations or outliers, we can visualize either:</p>
<ul>
<li>the <em>deviance residuals</em> or</li>
<li>the <em>dfbeta</em> values</li>
</ul>
<p>The function <em>ggcoxdiagnostics</em>() [in the <em>survminer</em> package] provides a convenient solution for checking influential observations. The simplified format is as follows:</p>
<pre class="r"><code>ggcoxdiagnostics(fit, type = , linear.predictions = TRUE)</code></pre>
<br/>
<div class="block">
<ul>
<li>fit: an object of class coxph.object</li>
<li>type: the type of residuals to present on Y axis. Allowed values include one of c(“martingale”, “deviance”, “score”, “schoenfeld”, “dfbeta”, “dfbetas”, “scaledsch”, “partial”).</li>
<li>linear.predictions: a logical value indicating whether to show linear predictions for observations (TRUE) or just the index of observations (FALSE) on the X axis.</li>
</ul>
</div>
<p><br/></p>
<p>Specifying the argument <em>type = “dfbeta”</em> plots the estimated changes in the regression coefficients upon deleting each observation in turn; likewise, <em>type = “dfbetas”</em> produces the estimated changes in the coefficients divided by their standard errors.</p>
<p>For example:</p>
<pre class="r"><code>ggcoxdiagnostics(res.cox, type = "dfbeta",
                 linear.predictions = FALSE, ggtheme = theme_bw())</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/cox-model-assumptions-influential-observations-1.png" alt="Cox Model Assumptions" width="384" style="margin-bottom:10px;" />
<p class="caption">
Cox Model Assumptions
</p>
</div>
<p>(Index plots of dfbeta for the Cox regression of time to death on age, sex and wt.loss)</p>
<p>Comparing the magnitudes of the largest dfbeta values in the index plots above to the regression coefficients suggests that no single observation is particularly influential, even though some of the dfbeta values for age and wt.loss are large compared with the others.</p>
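<p>The same comparison can be made numerically. The sketch below extracts the dfbeta values with <em>residuals</em>() and relates the largest absolute change for each covariate to the size of its coefficient; the ratio is just one informal way to summarize influence, not a formal test.</p>
<pre class="r"><code>library("survival")

res.cox <- coxph(Surv(time, status) ~ age + sex + wt.loss, data = lung)

# Matrix with one row per observation and one column per coefficient
dfb <- residuals(res.cox, type = "dfbeta")
colnames(dfb) <- names(coef(res.cox))

# Largest absolute change in each coefficient when one observation is dropped
max.change <- apply(abs(dfb), 2, max)

# Relative to the magnitude of the coefficient itself
round(max.change / abs(coef(res.cox)), 3)</code></pre>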
<p>It’s also possible to check for outliers by visualizing the deviance residuals. The deviance residual is a normalized transform of the martingale residual. These residuals should be roughly symmetrically distributed about zero with a standard deviation of 1.</p>
<ul>
<li>Positive values correspond to individuals that “died too soon” compared to expected survival times.</li>
<li>Negative values correspond to individuals that “lived too long”.</li>
<li>Very large or small values are outliers, which are poorly predicted by the model.</li>
</ul>
<p>Example of deviance residuals:</p>
<pre class="r"><code>ggcoxdiagnostics(res.cox, type = "deviance",
                 linear.predictions = FALSE, ggtheme = theme_bw())</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/cox-model-assumptions-deviance-residuals-outliers-1.png" alt="Cox Model Assumptions" width="384" style="margin-bottom:10px;" />
<p class="caption">
Cox Model Assumptions
</p>
</div>
<p><span class="success">The pattern looks fairly symmetric around 0.</span></p>
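<p>To back up the visual impression, the deviance residuals can be summarized directly. This is a sketch under the same model as above; listing the five largest absolute residuals is an arbitrary choice for illustration.</p>
<pre class="r"><code>library("survival")

res.cox <- coxph(Surv(time, status) ~ age + sex + wt.loss, data = lung)
dev.res <- residuals(res.cox, type = "deviance")

# Center and spread of the deviance residuals
c(mean = mean(dev.res), sd = sd(dev.res))

# Observations that are most poorly predicted by the model
head(sort(abs(dev.res), decreasing = TRUE), 5)</code></pre>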
</div>
<div id="testing-non-linearity" class="section level3">
<h3>Testing nonlinearity</h3>
<p>Often, we assume that continuous covariates have a linear form. However, this assumption should be checked.</p>
<p>Plotting the <em>Martingale residuals</em> against continuous covariates is a common approach used to detect <em>nonlinearity</em> or, in other words, to assess the functional form of a covariate. For a given continuous covariate, patterns in the plot may suggest that the variable is not properly fit.</p>
<p>Nonlinearity is not an issue for categorical variables, so we only examine plots of martingale residuals and partial residuals against a continuous variable.</p>
<p>Martingale residuals may present any value in the range (-INF, +1):</p>
<ul>
<li>A value of martingale residuals near 1 represents individuals that “died too soon”,</li>
<li>and large negative values correspond to individuals that “lived too long”.</li>
</ul>
<p>To assess the functional form of a continuous variable in a Cox proportional hazards model, we’ll use the function <em>ggcoxfunctional</em>() [in the <em>survminer</em> R package].</p>
<p>The function <em>ggcoxfunctional</em>() plots continuous covariates against the martingale residuals of a null Cox proportional hazards model. This can help in choosing the proper functional form of a continuous variable in the Cox model. The fitted lowess lines should be approximately linear to satisfy the Cox proportional hazards model assumptions.</p>
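<p>The idea behind this plot can be reproduced with base R, as a rough sketch rather than the exact survminer implementation: fit a null Cox model (no covariates), take its martingale residuals, and smooth them against the covariate with <em>lowess</em>(). The <em>lung</em> data set and the age covariate are assumed.</p>
<pre class="r"><code>library("survival")

# Null model: no covariates, so residuals reflect unexplained risk
null.cox <- coxph(Surv(time, status) ~ 1, data = lung)
mart.res <- residuals(null.cox, type = "martingale")

# Martingale residuals against age, with a lowess smooth;
# a roughly straight smooth suggests a linear functional form
plot(lung$age, mart.res, xlab = "age",
     ylab = "Martingale residuals of the null Cox model")
lines(lowess(lung$age, mart.res), col = "red")</code></pre>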
<p>For example, to assess the functional form of age, type this:</p>
<pre class="r"><code>ggcoxfunctional(Surv(time, status) ~ age + log(age) + sqrt(age), data = lung)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/cox-model-assumptions-cox-functional-forme-1.png" alt="Cox Model Assumptions" width="384" style="margin-bottom:10px;" />
<p class="caption">
Cox Model Assumptions
</p>
</div>
<p>There appears to be slight nonlinearity here.</p>
</div>
</div>
<div id="summary" class="section level2">
<h2>Summary</h2>
<p>We described how to assess the validity of the Cox model assumptions using the survival and survminer packages.</p>
</div>
<div id="infos" class="section level2">
<h2>Infos</h2>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.3.2). </span></p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('#rdoc h1').addClass('wiki_paragraph1');
    jQuery('#rdoc h2').addClass('wiki_paragraph2');
    jQuery('#rdoc h3').addClass('wiki_paragraph3');
    jQuery('#rdoc h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>

<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->


<!-- END HTML -->]]></description>
			<pubDate>Tue, 13 Dec 2016 00:03:00 +0100</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Cox Proportional-Hazards Model]]></title>
			<link>https://www.sthda.com/english/wiki/cox-proportional-hazards-model</link>
			<guid>https://www.sthda.com/english/wiki/cox-proportional-hazards-model</guid>
			<description><![CDATA[<!-- START HTML -->

  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">


<p><br/></p>
<p>The <strong>Cox proportional-hazards model</strong> (Cox, 1972) is essentially a regression model commonly used in medical research for investigating the association between the survival time of patients and one or more predictor variables.</p>
<p>In the previous chapter (<a href="https://www.sthda.com/english/english/wiki/survival-analysis-basics">survival analysis basics</a>), we described the basic concepts of survival analyses and methods for analyzing and summarizing survival data, including:</p>
<ul>
<li>the definition of hazard and survival functions,</li>
<li>the construction of Kaplan-Meier survival curves for different patient groups</li>
<li>the logrank test for comparing two or more survival curves</li>
</ul>
<p>The above mentioned methods - Kaplan-Meier curves and logrank tests - are examples of <em>univariate analysis</em>. They describe the survival according to one factor under investigation, but ignore the impact of any others.</p>
<p>Additionally, Kaplan-Meier curves and logrank tests are useful only when the predictor variable is categorical (e.g.: treatment A vs treatment B; males vs females). They don’t work easily for quantitative predictors such as gene expression, weight, or age.</p>
<p>An alternative method is the Cox proportional hazards regression analysis, which works for both quantitative predictor variables and for categorical variables. Furthermore, the Cox regression model extends survival analysis methods to assess simultaneously the effect of several risk factors on survival time.</p>
<p><span class="success">In this article, we’ll describe the Cox regression model and provide practical examples using R software.</span></p>


<div id="TOC" class = "block">
  <strong>Contents</strong><br/>
<ul>
<li><a href="#the-need-for-multivariate-statistical-modeling">The need for multivariate statistical modeling</a></li>
<li><a href="#basics-of-the-cox-proportional-hazards-model">Basics of the Cox proportional hazards model</a></li>
<li><a href="#compute-the-cox-model-in-r">Compute the Cox model in R</a><ul>
<li><a href="#install-and-load-required-r-package">Install and load required R package</a></li>
<li><a href="#r-function-to-compute-the-cox-model-coxph">R function to compute the Cox model: coxph()</a></li>
<li><a href="#example-data-sets">Example data sets</a></li>
<li><a href="#compute-the-cox-model">Compute the Cox model</a></li>
<li><a href="#visualizing-the-estimated-distribution-of-survival-times">Visualizing the estimated distribution of survival times</a></li>
</ul></li>
<li><a href="#summary">Summary</a></li>
<li><a href="#references">References</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<div id="the-need-for-multivariate-statistical-modeling" class="section level2">
<h2>The need for multivariate statistical modeling</h2>
<p>In clinical investigations, there are many situations where several known quantities (known as <em>covariates</em>) potentially affect patient prognosis.</p>
<p>For instance, suppose two groups of patients are compared: those with and those without a specific genotype. If one of the groups also contains older individuals, any difference in survival may be attributable to genotype or age or indeed both. Hence, when investigating survival in relation to any one factor, it is often desirable to adjust for the impact of others.</p>
<p>Statistical modeling is a frequently used tool that allows survival to be analyzed with respect to several factors simultaneously. Additionally, a statistical model provides the effect size for each factor.</p>
<p>The Cox proportional-hazards model is one of the most important methods used for modelling survival analysis data. The next section introduces the basics of the Cox regression model.</p>
</div>
<div id="basics-of-the-cox-proportional-hazards-model" class="section level2">
<h2>Basics of the Cox proportional hazards model</h2>
<p>The purpose of the model is to evaluate simultaneously the effect of several factors on survival. In other words, it allows us to examine how specified factors influence the rate of a particular event happening (e.g., infection, death) at a particular point in time. This rate is commonly referred to as the hazard rate. Predictor variables (or factors) are usually termed <em>covariates</em> in the survival-analysis literature.</p>
<p>The Cox model is expressed by the <em>hazard function</em> denoted by h(t). Briefly, the hazard function can be interpreted as the risk of dying at time t. It can be estimated as follows:</p>
<p><span class="math display">\[
h(t) = h_0(t) \times exp(b_1x_1 + b_2x_2 + ... + b_px_p)
\]</span></p>
<p>where,</p>
<ul>
<li><em>t</em> represents the survival time</li>
<li><span class="math inline">\(h(t)\)</span> is the hazard function determined by a set of p covariates (<span class="math inline">\(x_1, x_2, ..., x_p\)</span>)</li>
<li>the coefficients (<span class="math inline">\(b_1, b_2, ..., b_p\)</span>) measure the impact (i.e., the effect size) of covariates.</li>
<li>the term <span class="math inline">\(h_0\)</span> is called the baseline hazard. It corresponds to the value of the hazard if all the <span class="math inline">\(x_i\)</span> are equal to zero (the quantity exp(0) equals 1). The ‘t’ in h(t) reminds us that the hazard may vary over time.</li>
</ul>
<p>The Cox model can be written as a multiple linear regression of the logarithm of the hazard on the variables <span class="math inline">\(x_i\)</span>, with the baseline hazard being an ‘intercept’ term that varies with time.</p>
<p>The quantities <span class="math inline">\(exp(b_i)\)</span> are called hazard ratios (HR). A value of <span class="math inline">\(b_i\)</span> greater than zero, or equivalently a hazard ratio greater than one, indicates that as the value of the <span class="math inline">\(i^{th}\)</span> covariate increases, the event hazard increases and thus the length of survival decreases.</p>
<p>Put another way, a hazard ratio above 1 indicates a covariate that is positively associated with the event probability, and thus negatively associated with the length of survival.</p>
<p>In summary,</p>
<ul>
<li>HR = 1: No effect</li>
<li>HR < 1: Reduction in the hazard</li>
<li>HR > 1: Increase in the hazard</li>
</ul>
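<p>In R, hazard ratios are obtained by exponentiating the fitted coefficients. The sketch below assumes the <em>lung</em> data set from the survival package and an illustrative model with age and sex; the confidence limits come from exponentiating the intervals for the coefficients.</p>
<pre class="r"><code>library("survival")

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Hazard ratios: exp(b_i) for each covariate
hr <- exp(coef(fit))

# 95% confidence intervals on the hazard ratio scale
hr.ci <- exp(confint(fit))

# HR > 1: increase in the hazard; HR < 1: reduction in the hazard
cbind(HR = hr, hr.ci)</code></pre>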
<br/>
<div class="success">
<p>Note that in cancer studies:</p>
<ul>
<li>A covariate with hazard ratio > 1 (i.e.: b > 0) is called a bad prognostic factor</li>
<li>A covariate with hazard ratio < 1 (i.e.: b < 0) is called a good prognostic factor</li>
</ul>
</div>
<p><br/></p>
<p><span class="warning">A key assumption of the Cox model is that the hazard curves for the groups of observations (or patients) should be proportional and cannot cross.</span></p>
<p>Consider two patients k and k’ that differ in their x-values. The corresponding hazard functions can be written as follows:</p>
<ul>
<li>Hazard function for the patient k:</li>
</ul>
<p><span class="math display">\[
h_k(t) = h_0(t)e^{\sum\limits_{i=1}^p{b_i x_i}}
\]</span></p>
<ul>
<li>Hazard function for the patient k’:</li>
</ul>
<p><span class="math display">\[
h_{k'}(t) = h_0(t)e^{\sum\limits_{i=1}^p{b_i x'_i}}
\]</span></p>
<ul>
<li>The hazard ratio for these two patients [<span class="math inline">\(\frac{h_k(t)}{h_{k'}(t)} = \frac{h_0(t)e^{\sum\limits_{i=1}^p{b_i x_i}}}{h_0(t)e^{\sum\limits_{i=1}^p{b_i x'_i}}} = \frac{e^{\sum\limits_{i=1}^p{b_i x_i}}}{e^{\sum\limits_{i=1}^p{b_i x'_i}}}\)</span>] is independent of time t.</li>
</ul>
<p><span class="warning">Consequently, the Cox model is a <em>proportional-hazards model</em>: the hazard of the event in any group is a constant multiple of the hazard in any other. This assumption implies that, as mentioned above, the hazard curves for the groups should be proportional and cannot cross.</span></p>
<p>In other words, if an individual has a risk of death at some initial time point that is twice as high as that of another individual, then at all later times the risk of death remains twice as high.</p>
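<p>As a worked illustration of this constant ratio (a sketch in the notation of the model equation above, with coefficients <span class="math inline">\(b_i\)</span> and covariates <span class="math inline">\(x_i\)</span>, and a hypothetical coefficient value that does not come from any fitted model): if patients k and k’ differ by one unit in a single covariate j and are identical otherwise, the baseline hazard cancels and the ratio reduces to <span class="math inline">\(e^{b_j}\)</span>.</p>
<p><span class="math display">\[
\frac{h_k(t)}{h_{k'}(t)} = e^{\sum\limits_{i=1}^p{b_i (x_i - x'_i)}} = e^{b_j}, \qquad \text{e.g. } b_j = \log 2 \approx 0.693 \;\Rightarrow\; \frac{h_k(t)}{h_{k'}(t)} = 2 \text{ for all } t
\]</span></p>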
<p><span class="error">This assumption of proportional hazards should be tested. We’ll discuss methods for assessing proportionality in the next article in this series: <a href="https://www.sthda.com/english/english/wiki/cox-model-assumptions">Cox Model Assumptions</a>.</span></p>
</div>
<div id="compute-the-cox-model-in-r" class="section level2">
<h2>Compute the Cox model in R</h2>
<div id="install-and-load-required-r-package" class="section level3">
<h3>Install and load required R package</h3>
<p>We’ll use two R packages:</p>
<ul>
<li><strong>survival</strong> for computing survival analyses</li>
<li><strong>survminer</strong> for visualizing survival analysis results</li>
</ul>
<ul>
<li>Install the packages</li>
</ul>
<pre class="r"><code>install.packages(c("survival", "survminer"))</code></pre>
<ul>
<li>Load the packages</li>
</ul>
<pre class="r"><code>library("survival")
library("survminer")</code></pre>
</div>
<div id="r-function-to-compute-the-cox-model-coxph" class="section level3">
<h3>R function to compute the Cox model: coxph()</h3>
<p>The function <em>coxph</em>() [in the <em>survival</em> package] can be used to compute the Cox proportional hazards regression model in R.</p>
<p>The simplified format is as follows:</p>
<pre class="r"><code>coxph(formula, data, method)</code></pre>
<br/>
<div class="block">
<ul>
<li>formula: a linear model formula with a survival object as the response variable. The survival object is created using the function <em>Surv</em>() as follows: <em>Surv(time, event)</em>.</li>
<li>data: a data frame containing the variables.</li>
<li>method: specifies how to handle ties. The default is ‘efron’; other options are ‘breslow’ and ‘exact’. The default ‘efron’ is generally preferred to the once-popular ‘breslow’ method. The ‘exact’ method is much more computationally intensive.</li>
</ul>
</div>
<p><br/></p>
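<p>As a minimal sketch of the tie-handling option, the same model can be fitted with two methods and the coefficients compared. Note that recent releases of the <em>survival</em> package name this argument <em>ties</em> and keep <em>method</em> as an alternate name; the object names below are illustrative only.</p>
<pre class="r"><code>library(survival)

# Fit the same model with each tie-handling method
fit_efron   <- coxph(Surv(time, status) ~ sex, data = lung, ties = "efron")
fit_breslow <- coxph(Surv(time, status) ~ sex, data = lung, ties = "breslow")

# Compare the estimated coefficients side by side
c(efron = unname(coef(fit_efron)), breslow = unname(coef(fit_breslow)))</code></pre>
<p>With few tied event times, the two estimates are usually very close.</p>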
</div>
<div id="example-data-sets" class="section level3">
<h3>Example data sets</h3>
<p>We’ll use the lung cancer data in the survival R package.</p>
<pre class="r"><code>data("lung")
head(lung)</code></pre>
<pre><code>  inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
1    3  306      2  74   1       1       90       100     1175      NA
2    3  455      2  68   1       0       90        90     1225      15
3    3 1010      1  56   1       0       90        90       NA      15
4    5  210      2  57   1       1       90        60     1150      11
5    1  883      2  60   1       0      100        90       NA       0
6   12 1022      1  74   1       1       50        80      513       0</code></pre>
<ul>
<li>inst: Institution code</li>
<li>time: Survival time in days</li>
<li>status: censoring status 1=censored, 2=dead</li>
<li>age: Age in years</li>
<li>sex: Male=1 Female=2</li>
<li>ph.ecog: ECOG performance score (0=good 5=dead)</li>
<li>ph.karno: Karnofsky performance score (bad=0-good=100) rated by physician</li>
<li>pat.karno: Karnofsky performance score as rated by patient</li>
<li>meal.cal: Calories consumed at meals</li>
<li>wt.loss: Weight loss in last six months</li>
</ul>
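<p>To see how the <em>time</em> and <em>status</em> columns are combined, the short sketch below builds the survival object directly. The name <em>surv_obj</em> is introduced here for illustration only.</p>
<pre class="r"><code>library(survival)

# Build the survival object; with 1/2 coding, Surv() treats 2 as the event
surv_obj <- with(lung, Surv(time, status))
head(surv_obj)   # censored times are printed with a trailing "+"

# Count censored (1) and dead (2) patients
table(lung$status)</code></pre>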
</div>
<div id="compute-the-cox-model" class="section level3">
<h3>Compute the Cox model</h3>
<p>We’ll fit the Cox regression using the following covariates: age, sex, ph.karno, ph.ecog and wt.loss.</p>
<p>We start by computing univariate Cox analyses for all these variables; then we’ll fit a multivariate Cox model to describe how the factors jointly impact survival.</p>
<div id="univariate-cox-regression" class="section level4">
<h4>Univariate Cox regression</h4>
<p>Univariate Cox analyses can be computed as follows:</p>
<pre class="r"><code>res.cox <- coxph(Surv(time, status) ~ sex, data = lung)
res.cox</code></pre>
<pre><code>Call:
coxph(formula = Surv(time, status) ~ sex, data = lung)

      coef exp(coef) se(coef)     z      p
sex -0.531     0.588    0.167 -3.18 0.0015

Likelihood ratio test=10.6  on 1 df, p=0.00111
n= 228, number of events= 165 </code></pre>
<p>The function <em>summary</em>() for Cox models produces a more complete report:</p>
<pre class="r"><code>summary(res.cox)</code></pre>
<pre><code>Call:
coxph(formula = Surv(time, status) ~ sex, data = lung)

  n= 228, number of events= 165 

       coef exp(coef) se(coef)      z Pr(>|z|)   
sex -0.5310    0.5880   0.1672 -3.176  0.00149 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    exp(coef) exp(-coef) lower .95 upper .95
sex     0.588      1.701    0.4237     0.816

Concordance= 0.579  (se = 0.022 )
Rsquare= 0.046   (max possible= 0.999 )
Likelihood ratio test= 10.63  on 1 df,   p=0.001111
Wald test            = 10.09  on 1 df,   p=0.001491
Score (logrank) test = 10.33  on 1 df,   p=0.001312</code></pre>
<p>The Cox regression results can be interpreted as follows:</p>
<ol style="list-style-type: decimal">
<li><p><em>Statistical significance</em>. The column marked “z” gives the Wald statistic value. It corresponds to the ratio of each regression coefficient to its standard error (z = coef/se(coef)). The Wald statistic evaluates whether the beta (<span class="math inline">\(\beta\)</span>) coefficient of a given variable is statistically significantly different from 0. From the output above, we can conclude that the variable sex has a highly statistically significant coefficient.</p></li>
<li><p><em>The regression coefficients</em>. The second feature to note in the Cox model results is the sign of the regression coefficients (coef). A positive sign means that the hazard (risk of death) is higher, and thus the prognosis worse, for subjects with higher values of that variable. The variable sex is encoded as a numeric vector (1: male, 2: female). The R summary for the Cox model gives the hazard ratio (HR) for the second group relative to the first group, that is, female versus male. The beta coefficient for sex = -0.53 indicates that females have a lower risk of death (higher survival rates) than males, in these data.</p></li>
<li><p><em>Hazard ratios</em>. The exponentiated coefficients (exp(coef) = exp(-0.53) = 0.59), also known as <em>hazard ratios</em>, give the effect size of covariates. For example, being female (sex=2) reduces the hazard by a factor of 0.59, or 41%. Being female is associated with a good prognosis.</p></li>
<li><p><em>Confidence intervals of the hazard ratios</em>. The summary output also gives the lower and upper 95% confidence limits for the hazard ratio (exp(coef)): lower 95% bound = 0.4237, upper 95% bound = 0.816.</p></li>
<li><p><em>Global statistical significance of the model</em>. Finally, the output gives p-values for three alternative tests of the overall significance of the model: the likelihood-ratio test, the Wald test, and the score (logrank) test. These three methods are asymptotically equivalent. For large enough N, they will give similar results. For small N, they may differ somewhat. The likelihood-ratio test has better behavior for small sample sizes, so it is generally preferred.</p></li>
</ol>
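<p>The quantities described above can be recomputed by hand from the fitted model, which may help to see where each column of the summary comes from. This is a sketch that refits the same univariate model; the variable names (<em>beta</em>, <em>se</em>, <em>z</em>, <em>hr</em>, <em>ci</em>) are introduced here for illustration.</p>
<pre class="r"><code>library(survival)

res.cox <- coxph(Surv(time, status) ~ sex, data = lung)

beta <- unname(coef(res.cox))                    # regression coefficient
se   <- unname(sqrt(diag(vcov(res.cox))))        # its standard error

z  <- beta / se                                  # Wald statistic
p  <- 2 * pnorm(-abs(z))                         # two-sided p-value
hr <- exp(beta)                                  # hazard ratio
ci <- exp(beta + c(-1, 1) * qnorm(0.975) * se)   # 95% CI for the hazard ratio

round(c(z = z, p = p, HR = hr, lower = ci[1], upper = ci[2]), 4)</code></pre>
<p>The values should agree with the z, Pr(>|z|), exp(coef), lower .95 and upper .95 entries printed by <em>summary</em>().</p>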
<p>To apply the univariate coxph function to multiple covariates at once, type this:</p>
<pre class="r"><code>covariates <- c("age", "sex", "ph.karno", "ph.ecog", "wt.loss")
# Build one formula per covariate, e.g. Surv(time, status) ~ age
univ_formulas <- sapply(covariates,
                        function(x) as.formula(paste("Surv(time, status) ~", x)))

# Fit one univariate Cox model per formula
univ_models <- lapply(univ_formulas, function(x) coxph(x, data = lung))

# Extract beta, hazard ratio with 95% CI, Wald statistic and p-value
univ_results <- lapply(univ_models,
                       function(x) {
                         x <- summary(x)
                         p.value <- signif(x$waldtest["pvalue"], digits = 2)
                         wald.test <- signif(x$waldtest["test"], digits = 2)
                         beta <- signif(x$coefficients[1], digits = 2)  # coefficient beta
                         HR <- signif(x$coefficients[2], digits = 2)    # exp(beta)
                         HR.confint.lower <- signif(x$conf.int[, "lower .95"], 2)
                         HR.confint.upper <- signif(x$conf.int[, "upper .95"], 2)
                         HR <- paste0(HR, " (",
                                      HR.confint.lower, "-", HR.confint.upper, ")")
                         res <- c(beta, HR, wald.test, p.value)
                         names(res) <- c("beta", "HR (95% CI for HR)", "wald.test",
                                         "p.value")
                         return(res)
                       })
res <- t(as.data.frame(univ_results, check.names = FALSE))
as.data.frame(res)</code></pre>
<pre><code>           beta HR (95% CI for HR) wald.test p.value
age       0.019            1 (1-1)       4.1   0.042
sex       -0.53   0.59 (0.42-0.82)        10  0.0015
ph.karno -0.016      0.98 (0.97-1)       7.9   0.005
ph.ecog    0.48        1.6 (1.3-2)        18 2.7e-05
wt.loss  0.0013         1 (0.99-1)      0.05    0.83</code></pre>
<p>The output above shows the regression beta coefficients, the effect sizes (given as hazard ratios) and statistical significance for each of the variables in relation to overall survival. Each factor is assessed through separate univariate Cox regressions.</p>
<br/>
<div class="success">
<p>From the output above,</p>
<ul>
<li><p>The variables sex, age, ph.karno and ph.ecog have statistically significant coefficients, while the coefficient for wt.loss is not significant (p = 0.83).</p></li>
<li><p>age and ph.ecog have positive beta coefficients, while sex has a negative coefficient. Thus, older age and higher ph.ecog are associated with poorer survival, whereas being female (sex=2) is associated with better survival.</p></li>
</ul>
</div>
<p><br/></p>
<p>Now, we want to describe how the factors jointly impact survival. To answer this question, we’ll perform a multivariate Cox regression analysis. As the variable wt.loss is not significant in the univariate Cox analysis, we’ll skip it in the multivariate analysis. In this example, we’ll include 3 factors (sex, age and ph.ecog) in the multivariate model.</p>
</div>
<div id="multivariate-cox-regression-analysis" class="section level4">
<h4>Multivariate Cox regression analysis</h4>
<p>A Cox regression of time to death on the time-constant covariates is specified as follows:</p>
<pre class="r"><code>res.cox <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data =  lung)
summary(res.cox)</code></pre>
<pre><code>Call:
coxph(formula = Surv(time, status) ~ age + sex + ph.ecog, data = lung)

  n= 227, number of events= 164 
   (1 observation deleted due to missingness)

             coef exp(coef)  se(coef)      z Pr(>|z|)    
age      0.011067  1.011128  0.009267  1.194 0.232416    
sex     -0.552612  0.575445  0.167739 -3.294 0.000986 ***
ph.ecog  0.463728  1.589991  0.113577  4.083 4.45e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

        exp(coef) exp(-coef) lower .95 upper .95
age        1.0111     0.9890    0.9929    1.0297
sex        0.5754     1.7378    0.4142    0.7994
ph.ecog    1.5900     0.6289    1.2727    1.9864

Concordance= 0.637  (se = 0.026 )
Rsquare= 0.126   (max possible= 0.999 )
Likelihood ratio test= 30.5  on 3 df,   p=1.083e-06
Wald test            = 29.93  on 3 df,   p=1.428e-06
Score (logrank) test = 30.5  on 3 df,   p=1.083e-06</code></pre>
<p>The p-values for all three overall tests (likelihood ratio, Wald, and score) are significant, indicating that the model is significant. These tests evaluate the omnibus null hypothesis that all of the betas (<span class="math inline">\(\beta\)</span>) are 0. In the above example, the test statistics are in close agreement, and the omnibus null hypothesis is soundly rejected.</p>
<p>In the multivariate Cox analysis, the covariates sex and ph.ecog remain significant (p < 0.05). However, the covariate age fails to be significant (p = 0.23, which is greater than 0.05).</p>
<p>The p-value for sex is 0.000986, with a hazard ratio HR = exp(coef) = 0.58, indicating a strong relationship between the patients’ sex and a decreased risk of death. The hazard ratios of covariates are interpretable as multiplicative effects on the hazard. For example, holding the other covariates constant, being female (sex=2) reduces the hazard by a factor of 0.58, or 42%. We conclude that being female is associated with a good prognosis.</p>
<p>Similarly, the p-value for ph.ecog is 4.45e-05, with a hazard ratio HR = 1.59, indicating a strong relationship between the ph.ecog value and an increased risk of death. Holding the other covariates constant, a higher value of ph.ecog is associated with poorer survival.</p>
<p>By contrast, the p-value for age is now p=0.23. The hazard ratio is HR = exp(coef) = 1.01, with a 95% confidence interval of 0.99 to 1.03. Because the confidence interval for the HR includes 1, these results indicate that age makes a smaller contribution to the difference in the HR after adjusting for the ph.ecog values and the patient’s sex, and that its effect is no longer statistically significant. For example, holding the other covariates constant, an additional year of age multiplies the hazard of death by a factor of exp(beta) = 1.01, that is, an increase of about 1%, which is not a significant contribution.</p>
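<p>The hazard ratios and confidence limits discussed above can also be collected into a single table. The sketch below refits the same multivariate model and uses <em>coef</em>() and <em>confint</em>(); the name <em>hr_table</em> is introduced here for illustration.</p>
<pre class="r"><code>library(survival)

res.cox <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# Hazard ratios with 95% confidence limits, one row per covariate
hr_table <- exp(cbind(HR = coef(res.cox), confint(res.cox)))
round(hr_table, 2)</code></pre>
<p>A row whose interval contains 1 (here, age) corresponds to a covariate that is not significant at the 5% level.</p>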
</div>
</div>
<div id="visualizing-the-estimated-distribution-of-survival-times" class="section level3">
<h3>Visualizing the estimated distribution of survival times</h3>
<p>Having fit a Cox model to the data, it’s possible to visualize the predicted survival proportion at any given point in time for a particular risk group. The function <em>survfit</em>() estimates the survival proportion, by default at the mean values of covariates.</p>
<pre class="r"><code># Plot the survival curve estimated at the mean values of the covariates
ggsurvplot(survfit(res.cox), data = lung, color = "#2E9FDF",
           ggtheme = theme_minimal())</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/cox-proportional-hazards-cox-proportional-hazard-1.png" alt="Cox Proportional-Hazards Model" width="384" style="margin-bottom:10px;" />
<p class="caption">
Cox Proportional-Hazards Model
</p>
</div>
<p>We may wish to display how estimated survival depends upon the value of a covariate of interest.</p>
<p>Suppose that we want to assess the impact of sex on the estimated survival probability. In this case, we construct a new data frame with two rows, one for each value of sex; the other covariates are fixed to their average values (if they are continuous variables) or to their lowest level (if they are discrete variables). For a dummy covariate, the average value is the proportion coded 1 in the data set. This data frame is passed to <em>survfit</em>() via the <em>newdata</em> argument:</p>
<pre class="r"><code># Create the new data  
sex_df <- with(lung,
               data.frame(sex = c(1, 2), 
                          age = rep(mean(age, na.rm = TRUE), 2),
                          ph.ecog = c(1, 1)
                          )
               )
sex_df</code></pre>
<pre><code>  sex      age ph.ecog
1   1 62.44737       1
2   2 62.44737       1</code></pre>
<pre class="r"><code># Survival curves
fit <- survfit(res.cox, newdata = sex_df)
ggsurvplot(fit, conf.int = TRUE, legend.labs=c("Sex=1", "Sex=2"),
           ggtheme = theme_minimal())</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/cox-proportional-hazards-survival-coxph-2-1.png" alt="Cox Proportional-Hazards Model" width="624" style="margin-bottom:10px;" />
<p class="caption">
Cox Proportional-Hazards Model
</p>
</div>
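<p>Besides plotting, the predicted survival probability at a chosen time can be read directly from the fitted curves. The sketch below rebuilds the model and the new data frame and asks for the estimate at 365 days; the time point and the name <em>surv_1y</em> are illustrative choices.</p>
<pre class="r"><code>library(survival)

res.cox <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)
sex_df <- data.frame(sex = c(1, 2),
                     age = rep(mean(lung$age, na.rm = TRUE), 2),
                     ph.ecog = c(1, 1))
fit <- survfit(res.cox, newdata = sex_df)

# Predicted survival probability at 365 days for each row of sex_df
surv_1y <- summary(fit, times = 365)$surv
surv_1y</code></pre>
<p>The first value corresponds to sex=1 and the second to sex=2, in the order of the rows of <em>sex_df</em>.</p>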
</div>
</div>
<div id="summary" class="section level2">
<h2>Summary</h2>
<p>In this article, we described the Cox regression model for assessing simultaneously the relationship between multiple risk factors and patient’s survival time. We demonstrated how to compute the Cox model using the <em>survival</em> package. Additionally, we described how to visualize the results of the analysis using the <em>survminer</em> package.</p>
</div>
<div id="references" class="section level2">
<h2>References</h2>
<ul>
<li>Cox DR (1972). Regression models and life tables (with discussion). J R Statist Soc B 34: 187–220</li>
<li>MJ Bradburn, TG Clark, SB Love and DG Altman. Survival Analysis Part II: Multivariate data analysis – an introduction to concepts and methods. British Journal of Cancer (2003) 89, 431 – 436</li>
</ul>
</div>
<div id="infos" class="section level2">
<h2>Infos</h2>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.3.2). </span></p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('#rdoc h1').addClass('wiki_paragraph1');
    jQuery('#rdoc h2').addClass('wiki_paragraph2');
    jQuery('#rdoc h3').addClass('wiki_paragraph3');
    jQuery('#rdoc h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>

<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->



<!-- END HTML -->]]></description>
			<pubDate>Mon, 12 Dec 2016 23:45:53 +0100</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Survival Analysis Basics]]></title>
			<link>https://www.sthda.com/english/wiki/survival-analysis-basics</link>
			<guid>https://www.sthda.com/english/wiki/survival-analysis-basics</guid>
			<description><![CDATA[<!-- START HTML -->

            
  <!--====================== start from here when you copy to sthda================-->  
  <div id="rdoc">

<p><br/> <strong>Survival analysis</strong> corresponds to a set of statistical approaches used to investigate the time it takes for an event of interest to occur.</p>
<p><strong>Survival analysis</strong> is used in a variety of fields such as:</p>
<ul>
<li><em>Cancer studies</em> for patients’ survival time analyses,</li>
<li><em>Sociology</em> for “event-history analysis”,</li>
<li>and in <em>engineering</em> for “failure-time analysis”.</li>
</ul>
<p>In cancer studies, typical research questions include:</p>
<ul>
<li>What is the impact of certain clinical characteristics on patients’ survival?</li>
<li>What is the probability that an individual survives 3 years?</li>
<li>Are there differences in survival between groups of patients?</li>
</ul>

<div id="TOC" class = "block">
<strong>Contents</strong><br/>
<ul>
<li><a href="#objectives">Objectives</a></li>
<li><a href="#basic-concepts">Basic concepts</a><ul>
<li><a href="#survival-time-and-type-of-events-in-cancer-studies">Survival time and type of events in cancer studies</a></li>
<li><a href="#censoring">Censoring</a></li>
<li><a href="#survival-and-hazard-functions">Survival and hazard functions</a></li>
<li><a href="#kaplan-meier-survival-estimate">Kaplan-Meier survival estimate</a></li>
</ul></li>
<li><a href="#survival-analysis-in-r">Survival analysis in R</a><ul>
<li><a href="#install-and-load-required-r-package">Install and load required R package</a></li>
<li><a href="#example-data-sets">Example data sets</a></li>
<li><a href="#compute-survival-curves-survfit">Compute survival curves: survfit()</a></li>
<li><a href="#access-to-the-value-returned-by-survfit">Access to the value returned by survfit()</a></li>
<li><a href="#visualize-survival-curves">Visualize survival curves</a></li>
<li><a href="#kaplan-meier-life-table-summary-of-survival-curves">Kaplan-Meier life table: summary of survival curves</a></li>
<li><a href="#log-rank-test-comparing-survival-curves-survdiff">Log-Rank test comparing survival curves: survdiff()</a></li>
<li><a href="#fit-complex-survival-curves">Fit complex survival curves</a></li>
</ul></li>
<li><a href="#summary">Summary</a></li>
<li><a href="#references">References</a></li>
<li><a href="#infos">Infos</a></li>
</ul>
</div>

<div id="objectives" class="section level2">
<h2>Objectives</h2>
<p>The aim of this chapter is to describe the basic concepts of survival analysis. In cancer studies, most survival analyses use the following methods:</p>
<ul>
<li><em>Kaplan-Meier plots</em> to visualize survival curves</li>
<li><em>Log-rank test</em> to compare the survival curves of two or more groups</li>
<li><em>Cox proportional hazards regression</em> to describe the effect of variables on survival. The Cox model is discussed in the next chapter: <a href="https://www.sthda.com/english/english/wiki/cox-proportional-hazards-model">Cox proportional hazards model</a>.</li>
</ul>
<p>Here, we’ll start by explaining the essential concepts of survival analysis, including:</p>
<ul>
<li>how to generate and interpret survival curves,</li>
<li>and how to quantify and test survival differences between two or more groups of patients.</li>
</ul>
<p>Then, we’ll continue by describing multivariate analysis using <a href="https://www.sthda.com/english/english/wiki/cox-proportional-hazards-model">Cox proportional hazards model</a>.</p>
</div>
<div id="basic-concepts" class="section level2">
<h2>Basic concepts</h2>
<p>Here, we start by defining fundamental terms of survival analysis including:</p>
<ul>
<li>Survival time and event</li>
<li>Censoring</li>
<li>Survival function and hazard function</li>
</ul>
<div id="survival-time-and-type-of-events-in-cancer-studies" class="section level3">
<h3>Survival time and type of events in cancer studies</h3>
<p>There are different types of events, including:</p>
<ul>
<li>Relapse</li>
<li>Progression</li>
<li>Death</li>
</ul>
<p><span class="success"> The time from ‘response to treatment’ (complete remission) to the occurrence of the event of interest is commonly called <em>survival time</em> (or time to event).</span></p>
<p>The two most important measures in cancer studies include: i) the <em>time to death</em>; and ii) the <em>relapse-free survival time</em>, which corresponds to the time between response to treatment and recurrence of the disease. It’s also known as <em>disease-free survival time</em> and <em>event-free survival time</em>.</p>
</div>
<div id="censoring" class="section level3">
<h3>Censoring</h3>
<p>As mentioned above, survival analysis focuses on the expected duration of time until occurrence of an event of interest (relapse or death). However, the event may not be observed for some individuals within the study time period, producing the so-called <em>censored</em> observations.</p>
<p>Censoring may arise in the following ways:</p>
<ol style="list-style-type: decimal">
<li>a patient has not (yet) experienced the event of interest, such as relapse or death, within the study time period;</li>
<li>a patient is lost to follow-up during the study period;</li>
<li>a patient experiences a different event that makes further follow-up impossible.</li>
</ol>
<p>This type of censoring, named <em>right censoring</em>, is handled in survival analysis.</p>
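<p>In R, right-censored observations are represented with the function <em>Surv</em>() from the <em>survival</em> package. The small data set below is hypothetical and serves only to show how a censored time is encoded and displayed.</p>
<pre class="r"><code>library(survival)

# Hypothetical follow-up data: time in days and whether the event was observed
toy <- data.frame(time  = c(5, 8, 12),
                  event = c(1, 0, 1))   # 1 = event observed, 0 = censored

s <- Surv(toy$time, toy$event)
s   # the censored observation is printed with a trailing "+"</code></pre>
<p>For the second patient, we only know that the survival time is longer than 8 days.</p>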
</div>
<div id="survival-and-hazard-functions" class="section level3">
<h3>Survival and hazard functions</h3>
<p>Two related quantities are used to describe survival data: the <em>survival probability</em> and the <em>hazard</em>.</p>
<p>The <em>survival probability</em>, also known as the survivor function <span class="math inline">\(S(t)\)</span>, is the probability that an individual survives from the time origin (e.g. diagnosis of cancer) to a specified future time t.</p>
<p>The <em>hazard</em>, denoted by <span class="math inline">\(h(t)\)</span>, is the instantaneous rate at which the event occurs at time t among individuals who are still under observation (at risk) at that time. Informally, it describes the risk of having the event in a short interval just after t, given survival up to t.</p>
<p><span class="success">Note that, in contrast to the survivor function, which focuses on not having an event, the hazard function focuses on the event occurring.</span></p>
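<p>For reference, the standard definitions linking the two functions can be written as follows. The symbol <span class="math inline">\(T\)</span>, denoting the random survival time of an individual, is notation introduced here and is not used elsewhere in this chapter.</p>
<p><span class="math display">\[S(t) = P(T \gt t), \qquad h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T \lt t + \Delta t \mid T \ge t)}{\Delta t}, \qquad S(t) = \exp\left(-\int_0^t h(u)\,du\right)\]</span></p>
<p>The last identity shows that a higher hazard over an interval translates into a faster decrease of the survival probability over that interval.</p>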
</div>
<div id="kaplan-meier-survival-estimate" class="section level3">
<h3>Kaplan-Meier survival estimate</h3>
<p>The Kaplan-Meier (KM) method is a non-parametric method used to estimate the survival probability from observed survival times (Kaplan and Meier, 1958).</p>
<p>The survival probability at time <span class="math inline">\(t_i\)</span>, <span class="math inline">\(S(t_i)\)</span>, is calculated as follows:</p>
<p><span class="math display">\[S(t_i) = S(t_{i-1})(1-\frac{d_i}{n_i})\]</span></p>
<p>Where,</p>
<ul>
<li><span class="math inline">\(S(t_{i-1})\)</span> = the probability of being alive at <span class="math inline">\(t_{i-1}\)</span></li>
<li><span class="math inline">\(n_i\)</span> = the number of patients alive just before <span class="math inline">\(t_i\)</span></li>
<li><span class="math inline">\(d_i\)</span> = the number of events at <span class="math inline">\(t_i\)</span></li>
<li><span class="math inline">\(t_0\)</span> = 0, <span class="math inline">\(S(0)\)</span> = 1</li>
</ul>
<p>The estimated probability (<span class="math inline">\(S(t)\)</span>) is a step function that changes value only at the time of each event. It’s also possible to compute confidence intervals for the survival probability.</p>
<p>The KM survival curve, a plot of the KM survival probability against time, provides a useful summary of the data that can be used to estimate measures such as median survival time.</p>
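<p>The recurrence above can be applied by hand to a small example and compared with the output of <em>survfit</em>(). The six observations below are hypothetical and the variable names are chosen for illustration only.</p>
<pre class="r"><code>library(survival)

# Hypothetical data: follow-up time and event indicator (1 = event, 0 = censored)
time  <- c(2, 3, 3, 5, 7, 8)
event <- c(1, 1, 0, 1, 0, 1)

# Apply S(t_i) = S(t_{i-1}) * (1 - d_i / n_i) at each distinct event time
event_times <- sort(unique(time[event == 1]))
n_i <- sapply(event_times, function(tt) sum(time >= tt))                # at risk just before tt
d_i <- sapply(event_times, function(tt) sum(time == tt & event == 1))   # events at tt
km_manual <- cumprod(1 - d_i / n_i)

# Same estimate from survfit(), which reports the event times by default
km_fit <- summary(survfit(Surv(time, event) ~ 1))$surv

data.frame(time = event_times, n_i, d_i, km_manual, km_fit)</code></pre>
<p>The censored observations (at times 3 and 7) do not produce a step in the curve, but they reduce the number at risk for later event times.</p>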
</div>
</div>
<div id="survival-analysis-in-r" class="section level2">
<h2>Survival analysis in R</h2>
<div id="install-and-load-required-r-package" class="section level3">
<h3>Install and load required R package</h3>
<p>We’ll use two R packages:</p>
<ul>
<li><em>survival</em> for computing survival analyses</li>
<li><em>survminer</em> for summarizing and visualizing the results of survival analysis</li>
</ul>
<ul>
<li>Install the packages</li>
</ul>
<pre class="r"><code>install.packages(c("survival", "survminer"))</code></pre>
<ul>
<li>Load the packages</li>
</ul>
<pre class="r"><code>library("survival")
library("survminer")</code></pre>
</div>
<div id="example-data-sets" class="section level3">
<h3>Example data sets</h3>
<p>We’ll use the lung cancer data available in the survival package.</p>
<pre class="r"><code>data("lung")
head(lung)</code></pre>
<pre><code>  inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
1    3  306      2  74   1       1       90       100     1175      NA
2    3  455      2  68   1       0       90        90     1225      15
3    3 1010      1  56   1       0       90        90       NA      15
4    5  210      2  57   1       1       90        60     1150      11
5    1  883      2  60   1       0      100        90       NA       0
6   12 1022      1  74   1       1       50        80      513       0</code></pre>
<ul>
<li>inst: Institution code</li>
<li>time: Survival time in days</li>
<li>status: censoring status 1=censored, 2=dead</li>
<li>age: Age in years</li>
<li>sex: Male=1 Female=2</li>
<li>ph.ecog: ECOG performance score (0=good 5=dead)</li>
<li>ph.karno: Karnofsky performance score (bad=0-good=100) rated by physician</li>
<li>pat.karno: Karnofsky performance score as rated by patient</li>
<li>meal.cal: Calories consumed at meals</li>
<li>wt.loss: Weight loss in last six months</li>
</ul>
</div>
<div id="compute-survival-curves-survfit" class="section level3">
<h3>Compute survival curves: survfit()</h3>
<p><span class="question">We want to compute the survival probability by sex.</span></p>
<p>The function <em>survfit</em>() [in the <em>survival</em> package] can be used to compute Kaplan-Meier survival estimates. Its main arguments include:</p>
<ul>
<li>a survival object created using the function <em>Surv</em>()</li>
<li>and the data set containing the variables.</li>
</ul>
<p>To compute survival curves, type this:</p>
<pre class="r"><code>fit <- survfit(Surv(time, status) ~ sex, data = lung)
print(fit)</code></pre>
<pre><code>Call: survfit(formula = Surv(time, status) ~ sex, data = lung)

        n events median 0.95LCL 0.95UCL
sex=1 138    112    270     212     310
sex=2  90     53    426     348     550</code></pre>
<p><span class="success">By default, the function print() shows a short summary of the survival curves. It prints the number of observations, number of events, the median survival and the confidence limits for the median. </span></p>
<p>If you want to display a more complete summary of the survival curves, type this:</p>
<pre class="r"><code># Summary of survival curves
summary(fit)

# Access the short summary table
summary(fit)$table</code></pre>
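<p>Individual values can be pulled out of these summaries. The sketch below refits the curves, extracts the median survival time of each group from the short table, and asks <em>summary</em>() for the estimate at one chosen time point (250 days, an illustrative choice) through its <em>times</em> argument.</p>
<pre class="r"><code>library(survival)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Median survival time (in days) for each group
medians <- summary(fit)$table[, "median"]
medians

# Estimated survival probability at 250 days for each group
surv_250 <- summary(fit, times = 250)$surv
surv_250</code></pre>
<p>The medians match the values printed by <em>print(fit)</em> (270 days for sex=1 and 426 days for sex=2).</p>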
</div>
<div id="access-to-the-value-returned-by-survfit" class="section level3">
<h3>Access to the value returned by survfit()</h3>
<p>The function <em>survfit</em>() returns a list of variables, including the following components:</p>
<br/>
<div class="block">
<ul>
<li>n: total number of subjects in each curve.</li>
<li>time: the time points on the curve.</li>
<li>n.risk: the number of subjects at risk at time t</li>
<li>n.event: the number of events that occurred at time t.</li>
<li>n.censor: the number of censored subjects, who exit the risk set, without an event, at time t.</li>
<li>lower,upper: lower and upper confidence limits for the curve, respectively.</li>
<li>strata: indicates stratification of curve estimation. If strata is not NULL, there are multiple curves in the result. The levels of strata (a factor) are the labels for the curves.</li>
</ul>
</div>
<p>The components can be accessed as follows:</p>
<pre class="r"><code>d <- data.frame(time = fit$time,
                  n.risk = fit$n.risk,
                  n.event = fit$n.event,
                  n.censor = fit$n.censor,
                  surv = fit$surv,
                  upper = fit$upper,
                  lower = fit$lower
                  )
head(d)</code></pre>
<pre><code>  time n.risk n.event n.censor      surv     upper     lower
1   11    138       3        0 0.9782609 1.0000000 0.9542301
2   12    135       1        0 0.9710145 0.9994124 0.9434235
3   13    134       2        0 0.9565217 0.9911586 0.9230952
4   15    132       1        0 0.9492754 0.9866017 0.9133612
5   26    131       1        0 0.9420290 0.9818365 0.9038355
6   30    130       1        0 0.9347826 0.9768989 0.8944820</code></pre>
</div>
<div id="visualize-survival-curves" class="section level3">
<h3>Visualize survival curves</h3>
<p>We’ll use the function <em>ggsurvplot</em>() [in the <em>survminer</em> R package] to produce the survival curves for the two groups of subjects.</p>
<p>It’s also possible to show:</p>
<ul>
<li>the 95% <em>confidence limits</em> of the survivor function using the argument <em>conf.int = TRUE</em>.</li>
<li>the number and/or the percentage of <em>individuals at risk</em> by time using the option <em>risk.table</em>. Allowed values for <em>risk.table</em> include:
<ul>
<li>TRUE or FALSE specifying whether to show or not the risk table. Default is FALSE.</li>
<li>“absolute” or “percentage”: to show the <em>absolute number</em> and the <em>percentage</em> of subjects at risk by time, respectively. Use “abs_pct” to show both absolute number and percentage.</li>
</ul></li>
<li>the <em>p-value</em> of the Log-Rank test comparing the groups using <em>pval = TRUE</em>.</li>
<li>horizontal/vertical line at <em>median survival</em> using the argument <em>surv.median.line</em>. Allowed values include one of c(“none”, “hv”, “h”, “v”). v: vertical, h:horizontal.</li>
</ul>
<pre class="r"><code># Change color, linetype by strata, risk.table color by strata
ggsurvplot(fit,
          pval = TRUE, conf.int = TRUE,
          risk.table = TRUE, # Add risk table
          risk.table.col = "strata", # Change risk table color by groups
          linetype = "strata", # Change line type by groups
          surv.median.line = "hv", # Specify median survival
          ggtheme = theme_bw(), # Change ggplot2 theme
          palette = c("#E7B800", "#2E9FDF"))</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/survival-analysis-basics-survival-curves-1.png" alt="Survival Analysis" width="576" style="margin-bottom:10px;" />
<p class="caption">
Survival Analysis
</p>
</div>
<p>The plot can be further customized using the following arguments:</p>
<ul>
<li><em>conf.int.style = “step”</em> to change the style of confidence interval bands.</li>
<li><em>xlab</em> to change the x axis label.</li>
<li><em>break.time.by = 200</em> break x axis in time intervals by 200.</li>
<li><em>risk.table = “abs_pct”</em> to show both absolute number and percentage of individuals at risk.</li>
<li><em>risk.table.y.text.col = TRUE</em> and <em>risk.table.y.text = FALSE</em> to provide bars instead of names in text annotations of the legend of risk table.</li>
<li><em>ncensor.plot = TRUE</em> to plot the number of censored subjects at time t. As suggested by <a href="https://github.com/kassambara/survminer/issues/18">Marcin Kosinski</a>, this is useful additional feedback alongside the survival curves: it shows what the curves look like, how large the risk set is, and why the risk set becomes smaller, that is, whether the decrease is caused by events or by censoring.</li>
<li><em>legend.labs</em> to change the legend labels.</li>
</ul>
<pre class="r"><code>ggsurvplot(
   fit,                       # survfit object with calculated statistics.
   pval = TRUE,               # show p-value of log-rank test.
   conf.int = TRUE,           # show confidence intervals for
                              # point estimates of survival curves.
   conf.int.style = "step",   # customize style of confidence intervals.
   xlab = "Time in days",     # customize X axis label.
   break.time.by = 200,       # break X axis in time intervals by 200.
   ggtheme = theme_light(),   # customize plot and risk table with a theme.
   risk.table = "abs_pct",    # absolute number and percentage at risk.
   risk.table.y.text.col = TRUE, # colour risk table text annotations.
   risk.table.y.text = FALSE, # show bars instead of names in text annotations
                              # in legend of risk table.
   ncensor.plot = TRUE,       # plot the number of censored subjects at time t.
   surv.median.line = "hv",   # add the median survival pointer.
   legend.labs =
     c("Male", "Female"),     # change legend labels.
   palette =
     c("#E7B800", "#2E9FDF")  # custom color palettes.
)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/survival-analysis-basics-customized-survival-plot-1.png" alt="Survival Analysis" width="672" style="margin-bottom:10px;" />
<p class="caption">
Survival Analysis
</p>
</div>
<p>The Kaplan-Meier plot can be interpreted as follows:</p>
<br/>
<div class="block">
<p>The horizontal axis (x-axis) represents time in days, and the vertical axis (y-axis) shows the probability of surviving or the proportion of people surviving. The lines represent survival curves of the two groups. A vertical drop in the curves indicates an event. The vertical tick mark on the curves means that a patient was censored at this time.</p>
<ul>
<li>At time zero, the survival probability is 1.0 (or 100% of the participants are alive).</li>
<li>At time 250, the probability of survival is approximately 0.55 (or 55%) for sex=1 and 0.75 (or 75%) for sex=2.</li>
<li>The median survival is approximately 270 days for sex=1 and 426 days for sex=2, suggesting better survival for sex=2 than for sex=1.</li>
</ul>
</div>
<p><br/></p>
<p>The median survival times for each group can be obtained using the code below:</p>
<pre class="r"><code>summary(fit)$table</code></pre>
<pre><code>      records n.max n.start events   *rmean *se(rmean) median 0.95LCL 0.95UCL
sex=1     138   138     138    112 325.0663   22.59845    270     212     310
sex=2      90    90      90     53 458.2757   33.78530    426     348     550</code></pre>
<p><span class="success"> The median survival times for each group represent the time at which the survival probability, S(t), is 0.5.</span></p>
<p>The median survival time for sex=1 (Male group) is 270 days, as opposed to 426 days for sex=2 (Female). There appears to be a survival advantage for females with lung cancer compared to males. However, evaluating whether this difference is statistically significant requires a formal statistical test, a subject that is discussed in the next sections.</p>
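<p>As a cross-check, the same medians can be read directly from the fitted curves. The sketch below assumes that <em>fit</em> was created with survfit(Surv(time, status) ~ sex, data = lung), consistent with the output shown above, and applies <em>quantile</em>() to the survfit object. The name <em>med</em> is only illustrative.</p>
<pre class="r"><code>library("survival")

# Assumption: this is the same model as the `fit` object used above
fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Time at which each Kaplan-Meier curve crosses S(t) = 0.5,
# together with the limits of its confidence interval
med <- quantile(fit, probs = 0.5)
med$quantile   # median survival time per group
med$lower      # lower confidence limits
med$upper      # upper confidence limits</code></pre>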
<p><span class="warning">Note that the confidence limits are wide at the tail of the curves, making meaningful interpretations difficult. This can be explained by the fact that, in practice, there are usually patients who are lost to follow-up or alive at the end of follow-up. Thus, it may be sensible to shorten plots before the end of follow-up on the x-axis (Pocock et al, 2002).</span></p>
<p>The survival curves can be shortened using the argument <em>xlim</em> as follows:</p>
<pre class="r"><code>ggsurvplot(fit,
          conf.int = TRUE,
          risk.table.col = "strata", # Change risk table color by groups
          ggtheme = theme_bw(), # Change ggplot2 theme
          palette = c("#E7B800", "#2E9FDF"),
          xlim = c(0, 600))</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/survival-analysis-basics-survival-curves-shorten-1.png" alt="Survival Analysis" width="576" style="margin-bottom:10px;" />
<p class="caption">
Survival Analysis
</p>
</div>
<br/>
<div class="warning">
<p>Note that three commonly used transformations can be specified using the argument <em>fun</em>:</p>
<ul>
<li>“log”: log transformation of the survivor function,</li>
<li>“event”: plots cumulative events (f(y) = 1-y). It’s also known as the cumulative incidence,</li>
<li>“cumhaz”: plots the cumulative hazard function (f(y) = -log(y)).</li>
</ul>
</div>
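<p>To make these transformations concrete, the short sketch below applies them by hand to a few survival probabilities. The vector <em>surv_prob</em> is an illustrative assumption (values similar to those read from the curves above), not output of the model, and the name <em>trans</em> is arbitrary.</p>
<pre class="r"><code># Illustrative survival probabilities (assumed values, not model output)
surv_prob <- c(1.00, 0.75, 0.55, 0.50)

trans <- data.frame(
  surv   = surv_prob,
  log    = log(surv_prob),   # "log": log of the survivor function
  event  = 1 - surv_prob,    # "event": cumulative events, f(y) = 1 - y
  cumhaz = -log(surv_prob)   # "cumhaz": cumulative hazard, f(y) = -log(y)
)
trans</code></pre>
<p>For instance, a survival probability of 0.5 corresponds to a cumulative incidence of 0.5 and to a cumulative hazard of about 0.69.</p>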
<p><br/></p>
<p>For example, to plot cumulative events, type this:</p>
<pre class="r"><code>ggsurvplot(fit,
          conf.int = TRUE,
          risk.table.col = "strata", # Change risk table color by groups
          ggtheme = theme_bw(), # Change ggplot2 theme
          palette = c("#E7B800", "#2E9FDF"),
          fun = "event")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/survival-analysis-basics-cumulative-events-1.png" alt="Survival Analysis" width="576" style="margin-bottom:10px;" />
<p class="caption">
Survival Analysis
</p>
</div>
<p>The <strong>cumulative hazard</strong> is commonly used to estimate the hazard probability. It’s defined as <span class="math inline">\(H(t) = -\log(S(t))\)</span>, where <span class="math inline">\(S(t)\)</span> is the survival function. The cumulative hazard (<span class="math inline">\(H(t)\)</span>) can be interpreted as the cumulative force of mortality. In other words, it corresponds to the number of events that would be expected for each individual by time t if the event were a repeatable process.</p>
<p>To plot cumulative hazard, type this:</p>
<pre class="r"><code>ggsurvplot(fit,
          conf.int = TRUE,
          risk.table.col = "strata", # Change risk table color by groups
          ggtheme = theme_bw(), # Change ggplot2 theme
          palette = c("#E7B800", "#2E9FDF"),
          fun = "cumhaz")</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/survival-analysis-basics-cumulative-hazard-1.png" alt="Survival Analysis" width="576" style="margin-bottom:10px;" />
<p class="caption">
Survival Analysis
</p>
</div>
</div>
<div id="kaplan-meier-life-table-summary-of-survival-curves" class="section level3">
<h3>Kaplan-Meier life table: summary of survival curves</h3>
<p>As mentioned above, you can use the function <em>summary</em>() to have a complete summary of survival curves:</p>
<pre class="r"><code>summary(fit)</code></pre>
<p>It’s also possible to use the function <em>surv_summary</em>() [in <em>survminer</em> package] to get a summary of survival curves. Compared to the default summary() function, surv_summary() creates a data frame containing a nice summary from survfit results.</p>
<pre class="r"><code>res.sum <- surv_summary(fit)
head(res.sum)</code></pre>
<pre><code>  time n.risk n.event n.censor      surv    std.err     upper     lower strata sex
1   11    138       3        0 0.9782609 0.01268978 1.0000000 0.9542301  sex=1   1
2   12    135       1        0 0.9710145 0.01470747 0.9994124 0.9434235  sex=1   1
3   13    134       2        0 0.9565217 0.01814885 0.9911586 0.9230952  sex=1   1
4   15    132       1        0 0.9492754 0.01967768 0.9866017 0.9133612  sex=1   1
5   26    131       1        0 0.9420290 0.02111708 0.9818365 0.9038355  sex=1   1
6   30    130       1        0 0.9347826 0.02248469 0.9768989 0.8944820  sex=1   1</code></pre>
<p>The function <em>surv_summary</em>() returns a data frame with the following columns:</p>
<ul>
<li>time: the time points at which the curve has a step.</li>
<li>n.risk: the number of subjects at risk at t.</li>
<li>n.event: the number of events that occur at time t.</li>
<li>n.censor: the number of subjects censored at time t.</li>
<li>surv: estimate of survival probability.</li>
<li>std.err: standard error of survival.</li>
<li>upper: upper end of confidence interval</li>
<li>lower: lower end of confidence interval</li>
<li>strata: indicates stratification of curve estimation. The levels of strata (a factor) are the labels for the curves.</li>
</ul>
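<p>The <em>surv</em> column follows from <em>n.risk</em> and <em>n.event</em> through the Kaplan-Meier product-limit formula: at each event time, the previous estimate is multiplied by (1 - n.event / n.risk). As a minimal check, the sketch below recomputes the first six estimates for sex=1 from the counts printed above. The name <em>km_surv</em> is only illustrative.</p>
<pre class="r"><code># Counts copied from the first six rows of the table above (sex=1)
n.risk  <- c(138, 135, 134, 132, 131, 130)
n.event <- c(3, 1, 2, 1, 1, 1)

# Kaplan-Meier estimate: cumulative product of the conditional
# probabilities of surviving each event time
km_surv <- cumprod(1 - n.event / n.risk)
round(km_surv, 7)</code></pre>
<p>The result reproduces the first six values of the <em>surv</em> column, starting with 0.9782609 and 0.9710145.</p>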
<p>When survival curves have been fitted with one or more variables, the surv_summary object contains extra columns representing these variables. This makes it possible to facet the output of ggsurvplot by strata or by some combination of factors.</p>
<p>The <em>surv_summary</em> object also has an attribute named ‘table’ containing information about the survival curves, including the median survival times with confidence intervals, as well as the total number of subjects and the number of events in each curve. To access the attribute ‘table’, type this:</p>
<pre class="r"><code>attr(res.sum, "table")</code></pre>
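<p>To illustrate the extra columns mentioned above, the sketch below fits curves on several variables and lists the column names of the summary. It assumes the <em>colon</em> data set from the <em>survival</em> package. The object names <em>fit_multi</em> and <em>res_multi</em> are only illustrative.</p>
<pre class="r"><code>library("survival")
library("survminer")

# Curves for each combination of sex, rx and adhere (colon data set)
fit_multi <- survfit(Surv(time, status) ~ sex + rx + adhere, data = colon)

# Passing the data explicitly lets surv_summary() recover the variables
res_multi <- surv_summary(fit_multi, data = colon)
names(res_multi)   # expected to include sex, rx and adhere</code></pre>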
</div>
<div id="log-rank-test-comparing-survival-curves-survdiff" class="section level3">
<h3>Log-Rank test comparing survival curves: survdiff()</h3>
<p>The <em>log-rank test</em> is the most widely used method of comparing two or more survival curves. The null hypothesis is that there is no difference in survival between the groups. The log-rank test is a non-parametric test, which makes no assumptions about the survival distributions. Essentially, the log-rank test compares the observed number of events in each group to what would be expected if the null hypothesis were true (i.e., if the survival curves were identical). Under the null hypothesis, the log-rank statistic approximately follows a chi-square distribution.</p>
<p>The function <em>survdiff</em>() [in <em>survival</em> package] can be used to compute <em>log-rank test</em> comparing two or more survival curves.</p>
<p><em>survdiff</em>() can be used as follows:</p>
<pre class="r"><code>surv_diff <- survdiff(Surv(time, status) ~ sex, data = lung)
surv_diff</code></pre>
<pre><code>Call:
survdiff(formula = Surv(time, status) ~ sex, data = lung)

        N Observed Expected (O-E)^2/E (O-E)^2/V
sex=1 138      112     91.6      4.55      10.3
sex=2  90       53     73.4      5.68      10.3

 Chisq= 10.3  on 1 degrees of freedom, p= 0.00131 </code></pre>
<p>The function returns a list of components, including:</p>
<ul>
<li>n: the number of subjects in each group.</li>
<li>obs: the weighted observed number of events in each group.</li>
<li>exp: the weighted expected number of events in each group.</li>
<li>chisq: the chi-square statistic for a test of equality.</li>
<li>strata: optionally, the number of subjects contained in each stratum.</li>
</ul>
<p><span class="success"> The log rank test for difference in survival gives a p-value of p = 0.0013, indicating that the sex groups differ significantly in survival. </span></p>
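<p>The printed p-value can be recomputed from the chi-square statistic. The sketch below is one way to do it, assuming a plain (unstratified) log-rank test, for which the degrees of freedom equal the number of groups minus one. The names <em>dof</em> and <em>p_value</em> are only illustrative.</p>
<pre class="r"><code>library("survival")

surv_diff <- survdiff(Surv(time, status) ~ sex, data = lung)

# Degrees of freedom: number of groups minus one
dof <- length(surv_diff$n) - 1

# Upper tail probability of the chi-square distribution
p_value <- pchisq(surv_diff$chisq, df = dof, lower.tail = FALSE)
p_value</code></pre>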
</div>
<div id="fit-complex-survival-curves" class="section level3">
<h3>Fit complex survival curves</h3>
<p>In this section, we’ll compute survival curves using the combination of multiple factors. Next, we’ll facet the output of ggsurvplot() by a combination of factors.</p>
<ol style="list-style-type: decimal">
<li>Fit (complex) survival curves using the colon data set</li>
</ol>
<pre class="r"><code>require("survival")
fit2 <- survfit( Surv(time, status) ~ sex + rx + adhere,
                data = colon )</code></pre>
<ol start="2" style="list-style-type: decimal">
<li>Visualize the output using survminer. The plot below shows survival curves by the sex variable faceted according to the values of rx &amp; adhere.</li>
</ol>
<pre class="r"><code># Plot survival curves by sex and facet by rx and adhere
ggsurv <- ggsurvplot(fit2, fun = "event", conf.int = TRUE,
                     ggtheme = theme_bw())

ggsurv$plot + theme_bw() +
  theme(legend.position = "right") +
  facet_grid(rx ~ adhere)</code></pre>
<div class="figure">
<img src="https://www.sthda.com/english/sthda/RDoc/figure/survival-analysis/survival-analysis-basics-complexe-survival-curves-1.png" alt="Survival Analysis" width="652.8" style="margin-bottom:10px;" />
<p class="caption">
Survival Analysis
</p>
</div>
</div>
</div>
<div id="summary" class="section level2">
<h2>Summary</h2>
<p>Survival analysis is a set of statistical approaches for data analysis where the outcome variable of interest is time until an event occurs.</p>
<p>Survival data are generally described and modeled in terms of two related functions:</p>
<ul>
<li><p>The survivor function gives the probability that an individual survives from the time of origin to some time beyond time t. It’s usually estimated by the Kaplan-Meier method. The logrank test may be used to test for differences between survival curves for groups, such as treatment arms.</p></li>
<li><p>The hazard function gives the instantaneous potential of having an event at a time, given survival up to that time. It is used primarily as a diagnostic tool or for specifying a mathematical model for survival analysis.</p></li>
</ul>
<p>In this article, we demonstrate how to perform and visualize survival analyses using the combination of two R packages: <em>survival</em> (for the analysis) and <em>survminer</em> (for the visualization).</p>
</div>
<div id="references" class="section level2">
<h2>References</h2>
<ul>
<li>Clark TG, Bradburn MJ, Love SB, Altman DG (2003) Survival analysis part I: basic concepts and first analyses. British Journal of Cancer 89: 232–238.</li>
<li>Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53: 457–481.</li>
<li>Pocock S, Clayton TC, Altman DG (2002) Survival plots of time-to-event outcomes in clinical trials: good practice and pitfalls. Lancet 359: 1686–1689.</li>
</ul>
</div>
<div id="infos" class="section level2">
<h2>Infos</h2>
<p><span class="warning"> This analysis has been performed using <strong>R software</strong> (ver. 3.3.2). </span></p>
</div>

<script>jQuery(document).ready(function () {
    jQuery('#rdoc h1').addClass('wiki_paragraph1');
    jQuery('#rdoc h2').addClass('wiki_paragraph2');
    jQuery('#rdoc h3').addClass('wiki_paragraph3');
    jQuery('#rdoc h4').addClass('wiki_paragraph4');
    });//add phpboost class to header</script>
<style>.content{padding:0px;}</style>

<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>
</div><!--end rdoc-->
<!--====================== stop here when you copy to sthda================-->

<!-- END HTML -->]]></description>
			<pubDate>Mon, 12 Dec 2016 20:27:40 +0100</pubDate>
			
		</item>
		
	</channel>
</rss>
