--- title: "Survival Analysis from Scratch" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Survival Analysis from Scratch} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(femtograd) ``` ## Introduction This vignette demonstrates femtograd's compositional philosophy: rather than providing a pre-built survival analysis function, we build one from primitives. This shows how the log-likelihood functions, parameter transforms, and inference tools compose to solve real statistical problems. We cover: 1. Fitting a Weibull survival model 2. Handling right-censored data 3. Building a custom censored log-likelihood 4. Adding covariate effects (accelerated failure time model) 5. Full inference: standard errors, CIs, hypothesis tests ## Parametric Survival: The Weibull Model The Weibull distribution is a workhorse of survival analysis. Its hazard function can model increasing, decreasing, or constant risk: - $k > 1$: increasing hazard (aging, wear-out) - $k = 1$: constant hazard (exponential, memoryless) - $k < 1$: decreasing hazard (infant mortality, burn-in) ### Fitting Without Censoring For complete (uncensored) data, we can use the built-in `loglik_weibull()`: ```{r} # Simulate survival times from Weibull(shape=1.5, scale=10) set.seed(42) n <- 100 true_shape <- 1.5 true_scale <- 10 times <- rweibull(n, shape = true_shape, scale = true_scale) # Fit Weibull model (both parameters positive -> log-transform) weibull_fit <- fit( function(log_k, log_lambda) { k <- exp(log_k) lambda <- exp(log_lambda) loglik_weibull(k, lambda, times) }, params = c(log_k = 0, log_lambda = 1) ) # Estimates on original scale estimates <- exp(coef(weibull_fit)) names(estimates) <- c("shape", "scale") estimates ``` The estimates should be close to the true values (shape = `r true_shape`, scale = `r true_scale`). ```{r} summary(weibull_fit) ``` ## Handling Right Censoring In real survival studies, some subjects haven't experienced the event by the end of observation (right censoring). The log-likelihood for censored data is: $$\ell(\theta) = \sum_{i: \delta_i = 1} \log f(t_i; \theta) + \sum_{i: \delta_i = 0} \log S(t_i; \theta)$$ where $\delta_i = 1$ for observed events and $\delta_i = 0$ for censored observations. $f$ is the density and $S$ is the survival function. For the Weibull with shape $k$ and scale $\lambda$: - Density: $f(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1} \exp\left[-\left(\frac{t}{\lambda}\right)^k\right]$ - Survival: $S(t) = \exp\left[-\left(\frac{t}{\lambda}\right)^k\right]$ - Log-survival: $\log S(t) = -\left(\frac{t}{\lambda}\right)^k$ Let's build a custom censored Weibull log-likelihood: ```{r} weibull_censored_loglik <- function(log_k, log_lambda, times, status) { k <- exp(log_k) lambda <- exp(log_lambda) # For observed events: log f(t) = log(k) - k*log(lambda) + (k-1)*log(t) # - (t/lambda)^k # For censored: log S(t) = -(t/lambda)^k # # Combined: # event_term: d*log(k) - d*k*log(lambda) + (k-1)*sum_log_t_events # surv_term: -sum( exp(k * (log(t_i) - log(lambda))) ) [all obs] n_events <- sum(status) sum_log_t_events <- sum(log(times[status == 1])) # Event contribution (only uncensored observations) event_term <- log(k) * n_events - k * log(lambda) * n_events + (k - 1) * sum_log_t_events # Survival contribution for ALL observations: # (t/lambda)^k = exp(k * (log(t) - log(lambda))) # We use this form because log(t) is plain data (numeric), # and k, log(lambda) are parameters tracked by AD. log_times <- log(times) surv_term <- k * 0 # creates a zero of the correct type (value or dual) for (i in seq_along(times)) { surv_term <- surv_term - exp(k * (log_times[i] - log(lambda))) } event_term + surv_term } ``` Note the use of `exp(k * (log(t) - log(lambda)))` instead of `(t/lambda)^k`. This avoids creating temporary `val()` objects for data, which is important for compatibility with femtograd's Hessian computation (which uses dual numbers internally). ### Simulating Censored Data ```{r} set.seed(123) n <- 150 # True survival times from Weibull true_times <- rweibull(n, shape = 1.5, scale = 10) # Censoring times (uniform) censor_times <- runif(n, min = 5, max = 20) # Observed data obs_times <- pmin(true_times, censor_times) status <- as.integer(true_times <= censor_times) cat("Events:", sum(status), "/ Censored:", sum(1 - status), "\n") ``` ### Fitting the Censored Model ```{r} censored_fit <- fit( function(log_k, log_lambda) { weibull_censored_loglik(log_k, log_lambda, obs_times, status) }, params = c(log_k = 0, log_lambda = 1) ) # Estimates on original scale est <- exp(coef(censored_fit)) names(est) <- c("shape", "scale") est ``` ```{r} # CIs on original scale exp(confint(censored_fit)) ``` ## Testing Shape = 1 (Is Exponential Adequate?) A natural question: is the simpler exponential model sufficient? This corresponds to testing $H_0: k = 1$ (i.e., $\log k = 0$). ```{r} # Wald test wt <- wald_test(censored_fit, parm = "log_k", null_value = 0) wt ``` We can also use a likelihood ratio test by fitting the nested exponential model: ```{r} # Exponential model (shape fixed at 1, i.e., log_k = 0) exp_censored_fit <- fit( function(log_lambda) { weibull_censored_loglik(0, log_lambda, obs_times, status) }, params = c(log_lambda = 1) ) # LRT anova(exp_censored_fit, censored_fit) ``` ## Adding Covariates: Accelerated Failure Time Model The accelerated failure time (AFT) model links covariates to the scale parameter: $$\log(\lambda_i) = \beta_0 + \beta_1 x_{i1} + \ldots + \beta_p x_{ip}$$ This is straightforward to build from our primitives: ```{r} set.seed(42) n <- 200 # Covariate: treatment group (0 = control, 1 = treatment) treatment <- rbinom(n, 1, 0.5) # True model: treatment increases scale (longer survival) true_scale <- exp(2 + 0.5 * treatment) # beta0=2, beta1=0.5 true_shape <- 1.5 # Simulate survival times true_times <- rweibull(n, shape = true_shape, scale = true_scale) censor_times <- runif(n, min = 5, max = 30) obs_times2 <- pmin(true_times, censor_times) status2 <- as.integer(true_times <= censor_times) cat("Events:", sum(status2), "/ Censored:", sum(1 - status2), "\n") ``` ```{r} # AFT Weibull with treatment covariate aft_loglik <- function(log_k, beta0, beta1, times, status, x) { k <- exp(log_k) log_times <- log(times) # Per-subject log-scale: log(lambda_i) = beta0 + beta1 * x_i n_obs <- length(times) ll <- k * 0 # zero of the correct type for (i in seq_len(n_obs)) { log_lambda_i <- beta0 + beta1 * x[i] # log(t/lambda) = log(t) - log(lambda) log_ratio <- log_times[i] - log_lambda_i # Survival contribution: -(t/lambda)^k = -exp(k * log(t/lambda)) ll <- ll - exp(k * log_ratio) if (status[i] == 1) { # Event contribution: log(k) - log(lambda) + (k-1)*log(t/lambda) ll <- ll + log(k) - log_lambda_i + (k - 1) * log_ratio } } ll } aft_fit <- fit( function(log_k, beta0, beta1) { aft_loglik(log_k, beta0, beta1, obs_times2, status2, treatment) }, params = c(log_k = 0, beta0 = 1, beta1 = 0) ) summary(aft_fit) ``` ### Interpreting the Treatment Effect ```{r} # Shape parameter cat("Shape (k):", exp(coef(aft_fit)["log_k"]), "\n") # Treatment effect: exp(beta1) is the scale ratio cat("Scale ratio (treatment/control):", exp(coef(aft_fit)["beta1"]), "\n") # 95% CI for treatment effect beta1_ci <- confint(aft_fit, parm = "beta1") cat("95% CI for beta1:", beta1_ci, "\n") cat("95% CI for scale ratio:", exp(beta1_ci), "\n") ``` ### Is Treatment Significant? ```{r} # Wald test wald_test(aft_fit, parm = "beta1", null_value = 0) ``` ## Summary This vignette demonstrated femtograd's compositional approach: 1. **Built-in primitives** (`loglik_weibull`, `exp`, `log`) handle the uncensored case directly 2. **Custom log-likelihoods** are straightforward to write---just compose operations on `value` objects 3. **Parameter transforms** (`exp(log_k)`) handle constraints naturally 4. **Inference tools** (`summary`, `confint`, `wald_test`, `anova`) work identically on custom models 5. **Covariate models** (AFT) are built by linking parameters to covariates inside the likelihood The same approach extends to other survival models (log-normal, log-logistic, piecewise exponential) and other statistical domains---define a log-likelihood from femtograd's differentiable operations, and the full inference pipeline follows automatically.