| Title: | Propensity Score Methods for Survival Analysis |
|---|---|
| Description: | Implements propensity score weighting methods for estimating counterfactual survival functions, marginal hazard ratios, and weighted Kaplan-Meier and cumulative risk curves in observational studies with time-to-event outcomes. Supports binary and multiple treatment groups with inverse probability of treatment weighting (IPW), overlap weighting (OW), and average treatment effect on the treated (ATT). Includes symmetric trimming (Crump extension) for extreme propensity scores. Variance estimation via analytical M-estimation or bootstrap. Methods based on Li et al. (2018) <doi:10.1080/01621459.2016.1260466>, Li & Li (2019) <doi:10.1214/19-AOAS1282>, and Cheng et al. (2022) <doi:10.1093/aje/kwac043>. |
| Authors: | Chengxin Yang [aut, cre], Chao Cheng [aut], Fan Li [aut], Fan Li [aut] |
| Maintainer: | Chengxin Yang <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 0.2.0 |
| Built: | 2026-05-20 09:10:46 UTC |
| Source: | https://github.com/cxinyang/pssurvival |
Functions for estimating propensity scores for binary and multiple treatment groups. Estimate Propensity Scores
Fits a propensity score model and extracts propensity scores for binary or multiple treatment groups. For binary treatments, uses binomial logistic regression. For multiple treatments (>2 levels), uses multinomial logistic regression to estimate generalized propensity scores.
estimate_ps(data, treatment_var, ps_formula, ps_control = list())estimate_ps(data, treatment_var, ps_formula, ps_control = list())
data |
A data.frame containing the analysis data (typically the cleaned data with complete cases). |
treatment_var |
A character string specifying the name of the treatment
variable in |
ps_formula |
A formula object for the propensity score model, of the
form |
ps_control |
An optional list of control parameters to pass to the
model fitting function ( |
Propensity Score Definition: Returns P(Z = observed | X) for each individual, not P(Z=1|X) for all (as in Rosenbaum & Rubin 1983). This definition enables direct use in IPW and extends naturally to multiple treatments.
Binary Treatments (2 levels):
Fits binomial logistic regression via glm(). Treatment is factorized
with levels sorted by sort(): numerically for numeric, alphabetically
for character, by factor level order for factor. Returns P(Z = observed | X).
Multiple Treatments (>2 levels):
Fits multinomial logistic regression via nnet::multinom(). Returns
P(Z = observed | X) for each individual from the generalized PS matrix.
Control Parameters (ps_control):
Binary: glm.control() parameters (default: epsilon=1e-08, maxit=25)
Multiple: multinom() parameters (default: MaxNWts=10000, maxit=100, trace=FALSE)
A list with the following components:
ps_model |
The fitted propensity score model object (class |
ps |
A numeric vector of propensity scores representing the probability
of receiving the actual treatment each individual received. Length equals
the number of rows in |
ps_matrix |
A numeric matrix of dimension n × K where n is the number of observations and K is the number of treatment levels. Each row contains the predicted probabilities for all treatment levels. Column names correspond to treatment levels. |
n_levels |
An integer indicating the number of treatment levels. |
treatment_levels |
A vector of unique treatment values sorted by
|
# Example 1: Binary treatment data(simdata_bin) ps_bin <- estimate_ps( data = simdata_bin, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2 ) summary(ps_bin$ps) table(simdata_bin$Z) # Example 2: Multiple treatments data(simdata_multi) ps_multi <- estimate_ps( data = simdata_multi, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2 ) head(ps_multi$ps_matrix)# Example 1: Binary treatment data(simdata_bin) ps_bin <- estimate_ps( data = simdata_bin, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2 ) summary(ps_bin$ps) table(simdata_bin$Z) # Example 2: Multiple treatments data(simdata_multi) ps_multi <- estimate_ps( data = simdata_multi, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2 ) head(ps_multi$ps_matrix)
Calculates propensity score weights for causal inference with optional trimming. Supports ATE, ATT, and overlap population estimands for binary and multiple treatment groups.
estimate_weights( ps_result, data, treatment_var, estimand = "ATE", att_group = NULL, trim = NULL, delta = NULL, alpha = NULL )estimate_weights( ps_result, data, treatment_var, estimand = "ATE", att_group = NULL, trim = NULL, delta = NULL, alpha = NULL )
ps_result |
A list returned by |
data |
A data.frame containing the treatment variable (same data used
in |
treatment_var |
A character string specifying the name of the treatment
variable in |
estimand |
Character string specifying the target population. One of:
|
att_group |
For ATT estimation, specifies which treatment group to target. This is MANDATORY when estimand = "ATT". Ignored for other estimands. |
trim |
Character string specifying the trimming method, or NULL for no trimming (default). Options: "symmetric" (Crump extension) or "asymmetric" (Sturmer extension). Trimming is NOT supported with overlap estimand. |
delta |
Trimming threshold for symmetric trimming in (0, 1/J], where J is the number of treatment levels. If NULL (default), uses recommended values from Yoshida et al. (2019). Ignored unless trim = "symmetric". |
alpha |
Percentile threshold for asymmetric trimming in (0, 0.5). If NULL (default), uses recommended values from Yoshida et al. (2019). Ignored unless trim = "asymmetric". |
Trimming Workflow: When trimming is requested, the function: (1) identifies observations to trim using PS from full data, (2) re-estimates PS on trimmed data, (3) calculates weights from re-estimated PS. This ensures trimming uses the original covariate distribution while weights reflect the overlapping population.
Overlap weights do not support trimming (already bounded in [0,1]).
A list containing:
weights |
Numeric vector of weights (length = nrow(data)). |
trim_summary |
Data frame with trimming summary by treatment group. |
ess |
Named numeric vector of effective sample size by treatment group. |
method |
Character string: "IPW" for ATE/ATT, "overlap" for overlap. |
estimand |
Character string of estimand used. |
att_group |
Target group for ATT (NULL if not applicable). |
trim_method |
Character string of trimming method (NULL if no trimming). |
delta |
Numeric trimming threshold used for symmetric trimming (NULL if not applicable). |
alpha |
Numeric percentile threshold used for asymmetric trimming (NULL if not applicable). |
n_levels |
Number of treatment levels. |
treatment_levels |
Vector of treatment level values. |
ps_result |
PS result object (refitted after trimming if trimming was applied). |
Li, F., & Li, F. (2019). Propensity score weighting for causal inference with multiple treatments. The Annals of Applied Statistics, 13(4), 2389-2415.
Yoshida, K., et al. (2019). Multinomial extension of propensity score trimming methods: A simulation study. American Journal of Epidemiology, 188(3), 609-616.
Crump, R. K., et al. (2009). Dealing with limited overlap in estimation of average treatment effects. Biometrika, 96(1), 187-199.
# Example 1: Overlap weighting for binary treatment data(simdata_bin) ps_bin <- estimate_ps( data = simdata_bin, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2 ) weights_ow <- estimate_weights( ps_result = ps_bin, data = simdata_bin, treatment_var = "Z", estimand = "overlap" ) summary(weights_ow$weights) # Example 2: ATT with multiple treatments data(simdata_multi) ps_multi <- estimate_ps( data = simdata_multi, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2 ) weights_att <- estimate_weights( ps_result = ps_multi, data = simdata_multi, treatment_var = "Z", estimand = "ATT", att_group = "C" ) summary(weights_att$weights)# Example 1: Overlap weighting for binary treatment data(simdata_bin) ps_bin <- estimate_ps( data = simdata_bin, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2 ) weights_ow <- estimate_weights( ps_result = ps_bin, data = simdata_bin, treatment_var = "Z", estimand = "overlap" ) summary(weights_ow$weights) # Example 2: ATT with multiple treatments data(simdata_multi) ps_multi <- estimate_ps( data = simdata_multi, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2 ) weights_att <- estimate_weights( ps_result = ps_multi, data = simdata_multi, treatment_var = "Z", estimand = "ATT", att_group = "C" ) summary(weights_att$weights)
Main user interface for estimating marginal hazard ratios using propensity score weighting. Supports binary and multiple treatment groups with various weighting schemes (IPW, OW, or ATT) and optional trimming. Variance can be estimated via bootstrap or robust sandwich estimator.
marCoxph( data, ps_formula, time_var, event_var, reference_level, weight_method = "IPW", att_group = NULL, trim = FALSE, delta = NULL, variance_method = "bootstrap", boot_level = "full", B = 100, parallel = FALSE, mc.cores = 2, seed = NULL, ps_control = list(), robust = TRUE )marCoxph( data, ps_formula, time_var, event_var, reference_level, weight_method = "IPW", att_group = NULL, trim = FALSE, delta = NULL, variance_method = "bootstrap", boot_level = "full", B = 100, parallel = FALSE, mc.cores = 2, seed = NULL, ps_control = list(), robust = TRUE )
data |
Data frame containing treatment, survival outcome, and covariates. |
ps_formula |
Formula for propensity score model: |
time_var |
Character string specifying the time-to-event variable name. |
event_var |
Character string specifying the event indicator variable name. Should be coded as 1=event, 0=censored. |
reference_level |
Treatment level to use as reference in Cox model. MANDATORY. Must be one of the treatment levels. |
weight_method |
Weighting method: "IPW" (inverse probability weighting), "OW" (overlap weighting), or "ATT" (average treatment effect on the treated). Default "IPW". |
att_group |
Target group for ATT. Required if |
trim |
Logical. Perform symmetric propensity score trimming? Default FALSE.
If TRUE, symmetric trimming is applied (Crump extension for multiple treatments).
See |
delta |
Threshold for symmetric trimming in |
variance_method |
Variance estimation method: "bootstrap" (default) or "robust".
"bootstrap" resamples the entire analysis pipeline. "robust" uses the sandwich
variance estimator from |
boot_level |
Bootstrap sampling level: "full" (default) or "strata".
"full" resamples from entire dataset (standard for observational studies). "strata"
resamples within each treatment group preserving group sizes (useful when treatment assignment
follows a stratified or fixed-ratio design). Only used if |
B |
Number of bootstrap iterations. Default 100. Used only if |
parallel |
Logical. Use parallel bootstrap computation? Default FALSE. |
mc.cores |
Number of cores for parallel bootstrap. Default 2. |
seed |
Random seed for bootstrap reproducibility. Default NULL. |
ps_control |
Control parameters for propensity score model. Default |
robust |
Logical. Use robust (sandwich) variance in Cox model fitting?
Default TRUE. When TRUE, |
**Weighting Methods:**
The weight_method parameter specifies the target population for causal
inference:
IPW (Inverse Probability Weighting): Observations are weighted
by the inverse probability of their observed treatment,
where j is the observed treatment group. Inference targets the combined
population (ATE type).
OW (Overlap Weighting): Observations are weighted by overlap weights, which extends to multiple treatment groups (Li et al., 2018; Li and Li, 2019). Inference targets the population at clinical equipoise (overlap population).
ATT (Average Treatment Effect on the Treated): IPW weights
tilted toward a specified target group. Observations in the target group
receive weight 1, others receive .
Inference targets the specified treatment group population (ATT type).
**Analysis Workflow:**
1. Extract treatment variable from ps_formula.
2. Estimate propensity scores using multinomial logistic regression (or logistic
for binary treatment).
3. Calculate propensity score weights based on weight_method and optional trim.
4. Fit marginal Cox model Surv(time, event) ~ treatment with weights.
5. Estimate variance via bootstrap (resampling full pipeline) or robust sandwich
estimator.
**Variance Estimation:**
- bootstrap: Resamples data (full or stratified), re-estimates PS and weights,
re-fits Cox model. Provides bootstrap SE for log hazard ratios.
- robust: Uses robust sandwich variance from coxph() directly. No
bootstrap performed (faster but may be less accurate with extreme weights).
Object of class "marCoxph" containing:
coxph_fitted |
Fitted |
logHR_est |
Named vector of estimated log hazard ratios. Names are formatted as "treatment_var:level" (e.g., "Z:B" for treatment Z, level B vs reference). |
logHR_se_robust |
Named vector of robust standard errors from |
logHR_se_bootstrap |
Named vector of bootstrap standard errors. NULL if
|
n_coxph_fitted |
Named vector of sample sizes per treatment group used in Cox model fitting (after trimming). |
events_coxph_fitted |
Named vector of event counts per treatment group used in Cox model fitting (after trimming). |
variance_method |
Variance method used: "bootstrap-full", "bootstrap-strata", or "robust". |
estimand |
Target estimand used. |
att_group |
Target group for ATT (NULL if not applicable). |
trim_method |
Trimming method (NULL if no trimming). |
delta |
Symmetric trimming threshold (NULL if not applicable). |
alpha |
Asymmetric trimming threshold (NULL if not applicable). |
treatment_var |
Name of treatment variable. |
treatment_levels |
Sorted unique treatment values. |
reference_level |
Reference level used in Cox model. |
n_levels |
Number of treatment groups. |
n |
Number of complete cases used in analysis. |
ps_result |
Propensity score estimation results. |
weight_result |
Weight estimation results. |
boot_result |
Bootstrap results (NULL if |
Li, F., Morgan, K. L., & Zaslavsky, A. M. (2018). Balancing covariates via propensity score weighting. Journal of the American Statistical Association, 113(521), 390-400.
Li, F., & Li, F. (2019). Propensity score weighting for causal inference with multiple treatments. The Annals of Applied Statistics, 13(4), 2389-2415.
Yoshida, K., et al. (2019). Multinomial extension of propensity score trimming methods: A simulation study. American Journal of Epidemiology, 188(3), 609-616.
# Example 1: Binary treatment with overlap weighting data(simdata_bin) result1 <- marCoxph( data = simdata_bin, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", reference_level = "A", weight_method = "OW" ) summary(result1) # Example 2: Multiple treatments with ATT and robust variance data(simdata_multi) result2 <- marCoxph( data = simdata_multi, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", reference_level = "C", weight_method = "ATT", att_group = "C", variance_method = "robust" ) summary(result2)# Example 1: Binary treatment with overlap weighting data(simdata_bin) result1 <- marCoxph( data = simdata_bin, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", reference_level = "A", weight_method = "OW" ) summary(result1) # Example 2: Multiple treatments with ATT and robust variance data(simdata_multi) result2 <- marCoxph( data = simdata_multi, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", reference_level = "C", weight_method = "ATT", att_group = "C", variance_method = "robust" ) summary(result2)
Plot Method for surveff Objects
## S3 method for class 'surveff' plot( x, type = "surv", max_time = NULL, strata_to_plot = NULL, strata_colors = NULL, conf_level = 0.95, include_CI = TRUE, curve_width = 1, CI_alpha = 0.3, legend_position = "right", legend_title = NULL, plot_title = NULL, ... )## S3 method for class 'surveff' plot( x, type = "surv", max_time = NULL, strata_to_plot = NULL, strata_colors = NULL, conf_level = 0.95, include_CI = TRUE, curve_width = 1, CI_alpha = 0.3, legend_position = "right", legend_title = NULL, plot_title = NULL, ... )
x |
A |
type |
Type of plot: "surv" for survival curves or "survdiff" for treatment effect curves. Default "surv". |
max_time |
Maximum time to display on x-axis. If NULL, uses max(eval_times). |
strata_to_plot |
Vector of strata to plot. For |
strata_colors |
Vector of color names/codes for strata. Length must match strata_to_plot. Order matches strata order. If NULL, uses ggplot2 default colors. |
conf_level |
Confidence level for confidence intervals. Default 0.95. |
include_CI |
Logical. Include confidence interval ribbons? Default TRUE. |
curve_width |
Line width for survival/difference curves. Default 1. |
CI_alpha |
Transparency level for CI ribbons (0-1). Default 0.3. |
legend_position |
Position of legend: "right" or "bottom". Default "right". |
legend_title |
Title for legend. If NULL, uses "Treatment" for type="surv" or "Comparison" for type="survdiff". |
plot_title |
Plot title. If NULL, uses default title based on type. |
... |
Additional arguments (ignored). |
Creates publication-ready plots of survival curves or treatment effects over time.
For type = "surv": Plots estimated survival functions with optional
confidence intervals. Y-axis ranges from 0 to 1.
For type = "survdiff": Plots estimated treatment effects (survival differences)
with optional confidence intervals. Y-axis is not constrained to [0,1].
A ggplot2 object.
Plot Method for Weighted Kaplan-Meier Estimates
## S3 method for class 'weightedKM' plot( x, type = "Kaplan-Meier", include_CI = TRUE, conf_type = "log-log", conf_level = 0.95, max_time = NULL, strata_to_plot = NULL, strata_colors = NULL, curve_width = 1, CI_alpha = 0.3, legend_position = "right", legend_title = NULL, plot_title = NULL, xlab = "Time", ylab = NULL, xlim = NULL, ylim = NULL, risk_table = FALSE, risk_table_stats = c("n.risk", "n.acc.event"), risk_table_height = 0.25, risk_table_breaks = NULL, risk_table_fontsize = 3.5, ... )## S3 method for class 'weightedKM' plot( x, type = "Kaplan-Meier", include_CI = TRUE, conf_type = "log-log", conf_level = 0.95, max_time = NULL, strata_to_plot = NULL, strata_colors = NULL, curve_width = 1, CI_alpha = 0.3, legend_position = "right", legend_title = NULL, plot_title = NULL, xlab = "Time", ylab = NULL, xlim = NULL, ylim = NULL, risk_table = FALSE, risk_table_stats = c("n.risk", "n.acc.event"), risk_table_height = 0.25, risk_table_breaks = NULL, risk_table_fontsize = 3.5, ... )
x |
An object of class "weightedKM" from |
type |
Type of curve to plot: "Kaplan-Meier" (survival probabilities, default) or "CR" (cumulative risk, aka. cumulative incidence = 1 - survival). |
include_CI |
Logical. Include confidence interval ribbons? Default TRUE. |
conf_type |
Type of confidence interval: "plain", "log", or "log-log" (default). See Details. |
conf_level |
Confidence level for intervals. Default 0.95. |
max_time |
Maximum time to display on x-axis. Default is maximum observed event time. |
strata_to_plot |
Character vector of treatment levels to plot. Default plots all groups. |
strata_colors |
Character vector of colors for each stratum in
|
curve_width |
Line width for survival curves. Default 1. |
CI_alpha |
Transparency level for confidence interval ribbons (0-1). Default 0.3. |
legend_position |
Position of legend: "right" or "bottom". Default "right". |
legend_title |
Title for legend. Default "Treatment". |
plot_title |
Main plot title. Default depends on |
xlab |
X-axis label. Default "Time". |
ylab |
Y-axis label. Default depends on |
xlim |
Numeric vector of length 2 specifying x-axis limits. Default |
ylim |
Numeric vector of length 2 specifying y-axis limits. Default |
risk_table |
Logical. Add risk table below the plot? Default FALSE.
Requires |
risk_table_stats |
Character vector specifying statistics to display in risk table.
Options: "n.risk" (number at risk), "n.acc.event" (cumulative events).
Default |
risk_table_height |
Numeric. Relative height of risk table (0-1). Default 0.25. |
risk_table_breaks |
Numeric vector of time points for risk table columns.
If NULL (default), automatically determined based on |
risk_table_fontsize |
Numeric. Font size for risk table text. Default 3.5. |
... |
Additional arguments (currently unused). |
When type = "CR", the function plots representing the
probability of experiencing the event by time t. Variance is the same as
for survival, but confidence intervals are calculated on the CR scale.
A ggplot2 object if risk_table = FALSE, or a combined plot
(cowplot object) if risk_table = TRUE.
Print Method for marCoxph Objects
## S3 method for class 'marCoxph' print(x, max.len = 10, round.digits = 4, ...)## S3 method for class 'marCoxph' print(x, max.len = 10, round.digits = 4, ...)
x |
A |
max.len |
Maximum number of treatment comparisons to print. Default 10. |
round.digits |
Number of digits for rounding displayed values. Default 4. |
... |
Additional arguments (ignored). |
Invisibly returns the input object x.
Print Method for surveff Objects
## S3 method for class 'surveff' print(x, max.len = 6, round.digits = 4, ...)## S3 method for class 'surveff' print(x, max.len = 6, round.digits = 4, ...)
x |
A |
max.len |
Maximum number of rows (time points) to print. Default 6. |
round.digits |
Number of digits for rounding displayed values. Default 4. |
... |
Additional arguments (ignored). |
Invisibly returns the input object x.
Prints a summary of weighted Kaplan-Meier survival estimates.
## S3 method for class 'weightedKM' print(x, print.digits = 3, print.rows = 10, ...)## S3 method for class 'weightedKM' print(x, print.digits = 3, print.rows = 10, ...)
x |
An object of class "weightedKM" from |
print.digits |
Number of decimal places for printed output. Default 3. |
print.rows |
Number of rows to print for each treatment group. Default 10. |
... |
Additional arguments (currently unused). |
Invisibly returns the input object x.
data(simdata_bin) result <- weightedKM( data = simdata_bin, treatment_var = "Z", time_var = "time", event_var = "event", weight_method = "none" ) print(result)data(simdata_bin) result <- weightedKM( data = simdata_bin, treatment_var = "Z", time_var = "time", event_var = "event", weight_method = "none" ) print(result)
A simulated dataset for demonstrating propensity score weighting methods in survival analysis with a binary treatment.
simdata_binsimdata_bin
A data frame with 1000 observations and 8 variables:
Continuous covariate (standard normal).
Continuous covariate (standard normal).
Continuous covariate (standard normal).
Binary covariate (0/1).
Binary covariate (0/1).
Treatment group: "A" or "B". Distribution is approximately 40:60.
Observed follow-up time (event or censoring), range 0-20.
Event indicator: 1 = event observed, 0 = censored.
The data were generated with the following characteristics:
Treatment assignment depends on X1, X2, and B1 via logistic model.
Survival times follow Weibull distributions with group-specific scales (group A has better survival than group B).
Censoring times follow an exponential distribution depending on X1 and B1.
Administrative censoring occurs at time 20.
Overall censoring rate is approximately 30
simdata_multi for a dataset with 4 treatment groups.
data(simdata_bin) head(simdata_bin) table(simdata_bin$Z)data(simdata_bin) head(simdata_bin) table(simdata_bin$Z)
A simulated dataset for demonstrating propensity score weighting methods in survival analysis with four treatment groups.
simdata_multisimdata_multi
A data frame with 1000 observations and 8 variables:
Continuous covariate (standard normal).
Continuous covariate (standard normal).
Continuous covariate (standard normal).
Binary covariate (0/1).
Binary covariate (0/1).
Treatment group: "A", "B", "C", or "D". Distribution is approximately 20:20:20:35.
Observed follow-up time (event or censoring), range 0-20.
Event indicator: 1 = event observed, 0 = censored.
The data were generated with the following characteristics:
Treatment assignment depends on X1, X2, X3, B1, and B2 via multinomial logistic model.
Survival times follow Weibull distributions with group-specific scales. Survival ordering (best to worst): C > A > B > D.
Censoring times follow an exponential distribution depending on X1 and B1.
Administrative censoring occurs at time 20.
Overall censoring rate is approximately 30
simdata_bin for a dataset with binary treatment.
data(simdata_multi) head(simdata_multi) table(simdata_multi$Z)data(simdata_multi) head(simdata_multi) table(simdata_multi$Z)
Summary Method for marCoxph Objects
## S3 method for class 'marCoxph' summary(object, conf_level = 0.95, round.digits = 4, style = "prints", ...)## S3 method for class 'marCoxph' summary(object, conf_level = 0.95, round.digits = 4, style = "prints", ...)
object |
A |
conf_level |
Confidence level for intervals. Default 0.95. |
round.digits |
Number of digits for rounding displayed values. Default 4.
Only used if |
style |
Output style: "prints" (print formatted tables) or "returns" (return vectors). Default "prints". |
... |
Additional arguments (ignored). |
Confidence intervals are Wald-type intervals calculated as:
Log scale: logHR ± z_crit * SE
Original scale: exp(logHR ± z_crit * SE)
The SE used depends on variance_method from the original marCoxph call:
"robust": Uses logHR_se_robust from sandwich estimator.
"bootstrap-full" or "bootstrap-strata": Uses logHR_se_bootstrap.
If style = "prints", returns invisibly. If style = "returns",
returns a list with:
logHR |
Named vector of log hazard ratio estimates. |
logHR_CI_lower |
Named vector of lower CI bounds on log scale. |
logHR_CI_upper |
Named vector of upper CI bounds on log scale. |
SE |
Named vector of standard errors on log scale (from variance_method). |
HR |
Named vector of hazard ratio estimates (original scale). |
HR_CI_lower |
Named vector of lower CI bounds on original scale. |
HR_CI_upper |
Named vector of upper CI bounds on original scale. |
variance_method |
Variance method used. |
conf_level |
Confidence level used. |
n_per_group |
Named vector of sample sizes per group in Cox model. |
events_per_group |
Named vector of event counts per group in Cox model. |
Summary Method for surveff Objects
## S3 method for class 'surveff' summary( object, conf_level = 0.95, max.len = 6, round.digits = 4, style = "prints", ... )## S3 method for class 'surveff' summary( object, conf_level = 0.95, max.len = 6, round.digits = 4, style = "prints", ... )
object |
A |
conf_level |
Confidence level for intervals. Default 0.95. |
max.len |
Maximum number of rows (time points) to print. Default 6.
Only used if |
round.digits |
Number of digits for rounding displayed values. Default 4.
Only used if |
style |
Output style: "prints" (print formatted tables) or "returns" (return list of matrices). Default "prints". |
... |
Additional arguments (ignored). |
If style = "prints", returns invisibly. If style = "returns",
returns a list with:
survival_summary |
List of matrices, one per treatment group, with columns: Time, Estimate, SE, CI.lower, CI.upper |
difference_summary |
List of matrices, one per contrast, with same columns. NULL if no contrasts estimated. |
Generates summary tables of weighted Kaplan-Meier survival or cumulative risk estimates with confidence intervals for each treatment group.
## S3 method for class 'weightedKM' summary( object, type = "Kaplan-Meier", conf_type = "log-log", conf_level = 0.95, print.digits = 3, print.rows = 10, ... )## S3 method for class 'weightedKM' summary( object, type = "Kaplan-Meier", conf_type = "log-log", conf_level = 0.95, print.digits = 3, print.rows = 10, ... )
object |
An object of class "weightedKM" from |
type |
Type of estimate to summarize: "Kaplan-Meier" (survival probabilities, default) or "CR" (cumulative risk, aka. cumulative incidence = 1 - survival). |
conf_type |
Type of confidence interval: "plain", "log", or
"log-log" (default). See |
conf_level |
Confidence level for intervals. Default 0.95. |
print.digits |
Number of decimal places for printed output. Default 3. |
print.rows |
Number of rows to print for each treatment group. Default 10. |
... |
Additional arguments (currently unused). |
This method provides tabular summaries of weighted Kaplan-Meier estimates with
confidence intervals. It uses the same CI calculation methods as
plot.weightedKM().
When type = "CR", the function transforms survival estimates to
cumulative risk and calculates confidence intervals on that scale.
The returned list contains full-precision matrices that can be used for further analysis. The printed output is rounded for readability.
A list with one element per treatment group. Each element is a matrix with columns:
time: Evaluation time points
estimate: Survival probability or cumulative risk
se: Standard error
CI_lower: Lower confidence bound
CI_upper: Upper confidence bound
The list is returned invisibly with full precision. Printed output is
rounded to print.digits decimal places and shows first print.rows
rows per group.
Main user interface for estimating counterfactual survival functions and treatment effects using propensity score weighting and inverse probability of censoring weighting. Supports binary and multiple treatment groups with various weighting schemes (IPW, OW, or ATT) and optional trimming.
surveff( data, ps_formula, censoring_formula, eval_times = NULL, weight_method = "IPW", att_group = NULL, trim = FALSE, delta = NULL, contrast_matrix = NULL, censoring_method = "weibull", variance_method = NULL, B = 100, parallel = FALSE, mc.cores = 2, seed = NULL, censoring_control = NULL, ties = "efron", ps_control = list(), boot_level = "full" )surveff( data, ps_formula, censoring_formula, eval_times = NULL, weight_method = "IPW", att_group = NULL, trim = FALSE, delta = NULL, contrast_matrix = NULL, censoring_method = "weibull", variance_method = NULL, B = 100, parallel = FALSE, mc.cores = 2, seed = NULL, censoring_control = NULL, ties = "efron", ps_control = list(), boot_level = "full" )
data |
Data frame containing treatment, outcome, and covariates. |
ps_formula |
Formula for propensity score model: |
censoring_formula |
Formula for censoring model: |
eval_times |
Numeric vector of time points for evaluation. If NULL (default), uses all unique event times. |
weight_method |
Weighting method: "IPW" (inverse probability weighting), "OW" (overlap weighting), or "ATT" (average treatment effect on the treated). Default "IPW". |
att_group |
Target group for ATT. Required if |
trim |
Logical. Perform symmetric propensity score trimming? Default FALSE.
If TRUE, symmetric trimming is applied (Crump extension for multiple treatments).
See |
delta |
Threshold for symmetric trimming in |
contrast_matrix |
Optional matrix for treatment differences in multiple group settings. Each row defines one contrast with exactly two non-zero elements: -1 and 1. Column names must match treatment levels. For binary treatment, always estimates second level minus first level (S1 - S0), ignoring this parameter. |
censoring_method |
Method for censoring score estimation: "weibull" or "cox". Default "weibull". |
variance_method |
Variance estimation method: "analytical" (binary treatment with Weibull censoring only) or "bootstrap". Default "analytical" for binary Weibull, "bootstrap" otherwise. Cox censoring always uses bootstrap. |
B |
Number of bootstrap iterations. Default 100. Used only if |
parallel |
Logical. Use parallel bootstrap computation? Default FALSE. |
mc.cores |
Number of cores for parallel bootstrap. Default 2. |
seed |
Random seed for bootstrap reproducibility. Default NULL. |
censoring_control |
Control parameters passed to censoring model fitting function.
For Weibull: passed to |
ties |
Tie handling method for Cox models. Default "efron". Ignored for Weibull. |
ps_control |
Control parameters for propensity score model. Default |
boot_level |
Bootstrap sampling level: "full" (default) or "strata".
"full" resamples from entire dataset (standard for observational studies). "strata"
resamples within each treatment group preserving group sizes (useful when treatment assignment
follows a stratified or fixed-ratio design). Only used if |
**Weighting Methods:**
The weight_method parameter specifies the target population for causal
inference:
IPW (Inverse Probability Weighting): Observations are weighted
by the inverse probability of their observed treatment,
where j is the observed treatment group. Inference targets the combined
population.
OW (Overlap Weighting): Observations are weighted by overlap weights, which extends to multiple treatment groups (Li et al., 2018; Li and Li, 2019). Inference targets the population at clinical equipoise (overlap population).
ATT (Average Treatment Effect on the Treated): IPW weights
tilted toward a specified target group. Observations in the target group
receive weight 1, others receive .
Inference targets the specified treatment group population.
**Variance Estimation:** - Analytical: Binary treatment with Weibull censoring only (M-estimation). - Bootstrap: All settings (resamples entire pipeline). - Cox: Always uses bootstrap.
**Treatment Effects:**
- Binary: S1 - S0 (second level minus first).
- Multiple groups: Requires contrast_matrix for pairwise comparisons.
List containing:
survival_estimates |
Matrix [time x J] of survival function estimates for each group. |
survival_se |
Matrix [time x J] of standard errors for survival functions. |
difference_estimates |
Matrix [time x K] of treatment effect estimates.
For binary treatment: single column with S1-S0. For multiple groups: contrasts
from |
difference_se |
Matrix [time x K] of standard errors for treatment effects. |
eval_times |
Time points evaluated. |
treatment_levels |
Sorted unique treatment values. |
n_levels |
Number of treatment groups. |
n |
Sample size (complete cases after data validation). |
included |
Logical vector [n] indicating inclusion in analysis. TRUE = included, FALSE = excluded due to trimming. |
estimand |
Estimand used. |
censoring_method |
Censoring method used. |
variance_method |
Variance method used. |
contrast_matrix |
Contrast matrix used (NULL if not applicable). |
ps_model |
Fitted propensity score model (glm or multinom object). |
censoring_models |
Named list of fitted censoring models by treatment group. |
weights |
Numeric vector [n] of final weights (0 for trimmed observations). |
trim_summary |
Data frame with trimming summary by treatment group. |
ess |
Named numeric vector of effective sample size by treatment group. |
boot_result |
Bootstrap results (NULL if analytical variance used). |
Li, F., Morgan, K. L., & Zaslavsky, A. M. (2018). Balancing covariates via propensity score weighting. Journal of the American Statistical Association, 113(521), 390-400.
Li, F., & Li, F. (2019). Propensity score weighting for causal inference with multiple treatments. The Annals of Applied Statistics, 13(4), 2389-2415.
Yoshida, K., et al. (2019). Multinomial extension of propensity score trimming methods: A simulation study. American Journal of Epidemiology, 188(3), 609-616.
Cheng, C., Li, F., Thomas, L. E., & Li, F. (2022). Addressing extreme propensity scores in estimating counterfactual survival functions via the overlap weights. American Journal of Epidemiology, 191(6), 1140-1151.
# Example 1: Binary treatment with overlap weighting and Weibull censoring model data(simdata_bin) result1 <- surveff( data = simdata_bin, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, censoring_formula = survival::Surv(time, event) ~ X1 + B1, weight_method = "OW", censoring_method = "weibull" ) summary(result1) plot(result1) # Example 2: Multiple treatments with IPW and Cox censoring model data(simdata_multi) result2 <- surveff( data = simdata_multi, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, censoring_formula = survival::Surv(time, event) ~ X1 + B1, weight_method = "IPW", censoring_method = "cox", variance_method = "bootstrap", B = 100 ) summary(result2)# Example 1: Binary treatment with overlap weighting and Weibull censoring model data(simdata_bin) result1 <- surveff( data = simdata_bin, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, censoring_formula = survival::Surv(time, event) ~ X1 + B1, weight_method = "OW", censoring_method = "weibull" ) summary(result1) plot(result1) # Example 2: Multiple treatments with IPW and Cox censoring model data(simdata_multi) result2 <- surveff( data = simdata_multi, ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, censoring_formula = survival::Surv(time, event) ~ X1 + B1, weight_method = "IPW", censoring_method = "cox", variance_method = "bootstrap", B = 100 ) summary(result2)
Computes weighted Kaplan-Meier survival or cumulative incidence curves using
propensity score weights. Supports multiple treatment groups with various
weighting schemes (IPW, OW, or ATT) and optional trimming. Special case
weight_method = "none" provides classical (unweighted) Kaplan-Meier.
weightedKM( data, treatment_var, time_var, event_var, ps_formula = NULL, weight_method = "IPW", att_group = NULL, trim = FALSE, delta = NULL, ps_control = list() )weightedKM( data, treatment_var, time_var, event_var, ps_formula = NULL, weight_method = "IPW", att_group = NULL, trim = FALSE, delta = NULL, ps_control = list() )
data |
Data frame containing treatment, survival outcome, and covariates. |
treatment_var |
Character string specifying the name of the treatment variable. |
time_var |
Character string specifying the time-to-event variable name. |
event_var |
Character string specifying the event indicator variable name. Should be coded as 1=event, 0=censored. |
ps_formula |
Formula for propensity score model: |
weight_method |
Weighting method: "IPW" (inverse probability weighting), "OW" (overlap weighting), "ATT" (average treatment effect on the treated), or "none" (unweighted). Default "IPW". |
att_group |
Target group for ATT. Required if |
trim |
Logical. To perform symmetric propensity score trimming? Default FALSE.
If TRUE, symmetric trimming is applied (Crump extension for multiple treatments).
See |
delta |
Threshold for symmetric trimming in |
ps_control |
Control parameters for propensity score model. Default |
**Weighting Methods:**
The weight_method parameter specifies the target population for causal
inference:
IPW (Inverse Probability Weighting): Observations are weighted
by the inverse probability of their observed treatment,
where j is the observed treatment group. Inference targets the combined
population.
OW (Overlap Weighting): Observations are weighted by overlap weights, which extends to multiple treatment groups (Li et al., 2018; Li and Li, 2019). Inference targets the population at clinical equipoise (overlap population).
ATT (Average Treatment Effect on the Treated): IPW weights
tilted toward a specified target group. Observations in the target group
receive weight 1, others receive .
Inference targets the specified treatment group population.
none: No weighting applied (all weights = 1). Produces classical unweighted Kaplan-Meier estimates stratified by treatment.
Object of class "weightedKM" containing:
eval_times |
Numeric vector of all unique event times. |
surv_estimates |
Matrix [n_times x n_groups] of survival probability estimates. |
surv_var |
Matrix [n_times x n_groups] of variances. |
n_risk |
Matrix [n_times x n_groups] of weighted number at risk. |
n_event |
Matrix [n_times x n_groups] of weighted number of events. |
n_acc_event |
Matrix [n_times x n_groups] of cumulative weighted events up to each time. |
treatment_levels |
Sorted unique treatment values. |
weight_method |
Weighting method used. |
att_group |
Target group for ATT (NULL if not applicable). |
trim |
Logical indicating whether trimming was applied. |
delta |
Symmetric trimming threshold (NULL if trim = FALSE). |
n |
Number of complete cases used in analysis. |
ps_result |
Propensity score estimation results (NULL if |
weight_result |
Weight estimation results (NULL if |
weights |
Vector of weights used in estimation (all 1s if |
Li, F., Morgan, K. L., & Zaslavsky, A. M. (2018). Balancing covariates via propensity score weighting. Journal of the American Statistical Association, 113(521), 390-400.
Li, F., & Li, F. (2019). Propensity score weighting for causal inference with multiple treatments. The Annals of Applied Statistics, 13(4), 2389-2415.
Yoshida, K., et al. (2019). Multinomial extension of propensity score trimming methods: A simulation study. American Journal of Epidemiology, 188(3), 609-616.
# Example 1: Classical (unweighted) Kaplan-Meier for binary treatment data(simdata_bin) result1 <- weightedKM( data = simdata_bin, treatment_var = "Z", time_var = "time", event_var = "event", weight_method = "none" ) plot(result1) # Example 2: Overlap-weighted Kaplan-Meier result2 <- weightedKM( data = simdata_bin, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", weight_method = "OW" ) summary(result2) # Example 3: IPW-weighted Kaplan-Meier for multiple treatments data(simdata_multi) result3 <- weightedKM( data = simdata_multi, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", weight_method = "IPW" ) plot(result3) # Example 4: ATT with symmetric trimming result4 <- weightedKM( data = simdata_multi, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", weight_method = "ATT", att_group = "A", trim = TRUE, delta = 0.1 ) summary(result4)# Example 1: Classical (unweighted) Kaplan-Meier for binary treatment data(simdata_bin) result1 <- weightedKM( data = simdata_bin, treatment_var = "Z", time_var = "time", event_var = "event", weight_method = "none" ) plot(result1) # Example 2: Overlap-weighted Kaplan-Meier result2 <- weightedKM( data = simdata_bin, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", weight_method = "OW" ) summary(result2) # Example 3: IPW-weighted Kaplan-Meier for multiple treatments data(simdata_multi) result3 <- weightedKM( data = simdata_multi, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", weight_method = "IPW" ) plot(result3) # Example 4: ATT with symmetric trimming result4 <- weightedKM( data = simdata_multi, treatment_var = "Z", ps_formula = Z ~ X1 + X2 + X3 + B1 + B2, time_var = "time", event_var = "event", weight_method = "ATT", att_group = "A", trim = TRUE, delta = 0.1 ) summary(result4)