--- title: "Statistical Inference with femtograd" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Statistical Inference with femtograd} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(femtograd) ``` ## Overview After fitting a model with `fit()`, femtograd provides a full suite of inference tools: - **Standard errors** from the observed Fisher information - **Wald confidence intervals** and hypothesis tests - **Likelihood ratio tests** for nested model comparison - **Profile likelihood** confidence intervals - **Bootstrap inference** for non-asymptotic settings - **Model diagnostics** to verify inference reliability This vignette demonstrates each approach using a running example. ## Running Example We'll fit a gamma distribution to waiting time data: ```{r} set.seed(42) waiting <- rgamma(150, shape = 3, rate = 0.5) # Fit gamma model (both parameters positive, so use log-transform) gamma_fit <- fit( function(log_shape, log_rate) { shape <- exp(log_shape) rate <- exp(log_rate) loglik_gamma(shape, rate, waiting) }, params = c(log_shape = 0, log_rate = 0) ) summary(gamma_fit) ``` The estimates are on the log scale. To interpret, exponentiate: ```{r} exp(coef(gamma_fit)) ``` ## Wald Inference ### Standard Errors Standard errors come from the inverse of the observed Fisher information matrix (the negative Hessian of the log-likelihood at the MLE): $$\text{SE}(\hat{\theta}_j) = \sqrt{[\mathcal{I}(\hat{\theta})^{-1}]_{jj}}$$ ```{r} se(gamma_fit) ``` ### Confidence Intervals Wald confidence intervals assume asymptotic normality of the MLE: $$\hat{\theta}_j \pm z_{\alpha/2} \cdot \text{SE}(\hat{\theta}_j)$$ ```{r} # 95% CIs (default) confint(gamma_fit) # 99% CIs confint(gamma_fit, level = 0.99) ``` These are CIs for the log-transformed parameters. For CIs on the original scale, exponentiate the endpoints (this works because exp is monotone): ```{r} exp(confint(gamma_fit)) ``` ### Wald Test Test whether a parameter equals a hypothesized value: ```{r} # Test H0: log_shape = 0 (i.e., shape = 1, which is exponential) wt <- wald_test(gamma_fit, parm = "log_shape", null_value = 0) wt ``` Access individual components: ```{r} test_stat(wt) pval(wt) is_significant_at(wt, alpha = 0.05) ``` ## Likelihood Ratio Tests For comparing nested models, the likelihood ratio test (LRT) is often more powerful than the Wald test. The test statistic is: $$\Lambda = -2[\ell(\hat{\theta}_0) - \ell(\hat{\theta}_1)] \sim \chi^2_{df}$$ where $df$ is the difference in number of parameters. ```{r} # Null model: exponential (shape = 1, equivalent to log_shape = 0) exp_fit <- fit( function(log_rate) { rate <- exp(log_rate) loglik_exponential(rate, waiting) }, params = c(log_rate = 0) ) # LRT: Is gamma significantly better than exponential? lr <- lrt(exp_fit, gamma_fit) lr ``` The `anova()` function provides the same test in a familiar format: ```{r} anova(exp_fit, gamma_fit) ``` ## Model Comparison When models are not nested, use information criteria instead: ```{r} # Compare gamma vs exponential compare(exp_fit, gamma_fit, names = c("Exponential", "Gamma")) ``` The `dAIC` column shows how much worse each model is than the best. AIC weights give the relative probability that each model is best (assuming one of them is true). ## Profile Likelihood Profile likelihood provides confidence intervals that don't rely on the normality assumption. They are based on the likelihood ratio: $$\{\ \theta_j : 2[\ell(\hat{\theta}) - \ell_P(\theta_j)] \leq \chi^2_{1, \alpha} \}$$ ```{r} # Profile likelihood for log_shape prof <- profile_loglik(gamma_fit, parm = "log_shape") prof # Profile-based confidence intervals suppressWarnings(confint_profile(gamma_fit)) ``` Profile CIs are generally preferred over Wald CIs for small samples or when the likelihood surface is asymmetric. ## Bootstrap Inference For situations where asymptotic theory may not hold (small samples, boundary parameters, complex models), bootstrap inference provides a nonparametric alternative. The `bootstrap_fit()` function uses the `loglik_maker` pattern: a function that takes data and returns a log-likelihood function. ```{r} boot <- bootstrap_fit( loglik_maker = function(d) { function(log_shape, log_rate) { shape <- exp(log_shape) rate <- exp(log_rate) loglik_gamma(shape, rate, d) } }, data = waiting, params = c(log_shape = 0, log_rate = 0), n_boot = 200, progress = FALSE ) boot ``` Bootstrap confidence intervals come in three flavors: ```{r} # Percentile method (most common) confint(boot, type = "percentile") # Basic bootstrap confint(boot, type = "basic") # Normal approximation (uses bootstrap SE) confint(boot, type = "normal") ``` ## Model Diagnostics Before trusting inference results, verify that the model is well-behaved: ```{r} # Check Hessian properties check_hessian(gamma_fit) # Check optimization convergence check_convergence(gamma_fit) # Full diagnostics diagnostics(gamma_fit) ``` The quick check: ```{r} # Are standard errors reliable? se_reliable(gamma_fit) ``` ### What Can Go Wrong - **Non-positive-definite Hessian**: The MLE may be at a saddle point, or the model may be unidentifiable - **Large condition number**: Parameters are highly correlated, SEs may be unstable - **Non-zero gradient**: Optimization may not have converged ## Inference Summary | Method | When to use | Assumptions | |--------|------------|-------------| | Wald CI/test | Large samples, well-behaved likelihood | Asymptotic normality | | Likelihood ratio test | Nested models, moderate samples | Regularity conditions | | Profile likelihood | Asymmetric likelihood, moderate samples | Smooth likelihood | | Bootstrap | Small samples, complex models | Resampling validity | For routine analysis with moderate-to-large samples, Wald inference (the default from `confint()` and `summary()`) is usually sufficient. Use profile likelihood or bootstrap when you suspect the asymptotic approximation may be poor.