Inference Module (inference)

Advanced inference methods for difference-in-differences estimation.

This module provides inference methods beyond standard asymptotic approaches, including wild cluster bootstrap for reliable inference with few clusters.

Overview

Standard cluster-robust standard errors require a large number of clusters for valid asymptotic inference. When the number of clusters is small (typically fewer than 20-30), the wild cluster bootstrap gives more accurate inference: it perturbs the residuals with random cluster-level weights and builds the reference distribution of the test statistic from the resulting samples, rather than relying on asymptotic approximations.

This module implements the wild cluster bootstrap procedure with extensions for difference-in-differences settings.

Wild Cluster Bootstrap

class lwdid.inference.WildClusterBootstrapResult(att, se_bootstrap, ci_lower, ci_upper, pvalue, n_clusters, n_bootstrap, weight_type, t_stat_original, t_stats_bootstrap, rejection_rate, ci_method='percentile_t')

Result of wild cluster bootstrap inference.

This class contains the results of wild cluster bootstrap, which provides more reliable inference when the number of clusters is small.

Variables:

att (float) – Point estimate of ATT.
se_bootstrap (float) – Bootstrap standard error.
ci_lower (float) – Lower bound of bootstrap confidence interval.
ci_upper (float) – Upper bound of bootstrap confidence interval.
pvalue (float) – Bootstrap p-value (two-sided).
n_clusters (int) – Number of clusters.
n_bootstrap (int) – Number of bootstrap replications.
weight_type (str) – Type of bootstrap weights used.
t_stat_original (float) – Original t-statistic.
t_stats_bootstrap (np.ndarray) – Bootstrap t-statistics.
rejection_rate (float) – Proportion of bootstrap t-stats exceeding original.
ci_method (str) – Method used for CI construction ('percentile_t' or 'test_inversion').

att: float

se_bootstrap: float

ci_lower: float

ci_upper: float

pvalue: float

n_clusters: int

n_bootstrap: int

weight_type: str

t_stat_original: float

t_stats_bootstrap: ndarray

rejection_rate: float

ci_method: str = 'percentile_t'

summary()

Generate a human-readable summary of the bootstrap results.

Returns: Formatted summary string.
Return type: str

lwdid.inference.wild_cluster_bootstrap(data, y_transformed, d, cluster_var, controls=None, n_bootstrap=999, weight_type='rademacher', alpha=0.05, seed=None, impose_null=True, full_enumeration=None, use_wildboottest=False)

Perform wild cluster bootstrap for inference with few clusters.

Implements the wild cluster bootstrap of Cameron, Gelbach, and Miller (2008). This method provides more reliable inference when the number of clusters is small (< 20-30).

Algorithm:

1. Estimate the original model and obtain residuals.
2. For each bootstrap replication:
   (a) Generate cluster-level weights w_c (Rademacher, Mammen, or Webb).
   (b) Construct bootstrap residuals u*_ic = w_c * u_ic.
   (c) Construct the bootstrap outcome: y*_ic = X_ic * beta_hat + u*_ic with the treatment effect restricted to zero (if impose_null), or y*_ic = y_hat_ic + u*_ic from the unrestricted fit (if not impose_null).
   (d) Re-estimate the model and compute the t-statistic.
3. Compute the bootstrap p-value and confidence interval.

Parameters:

data (pd.DataFrame) – Regression data.
y_transformed (str) – Transformed outcome variable (e.g., after within-transformation).
d (str) – Treatment indicator variable.
cluster_var (str) – Clustering variable.
controls (list[str], optional) – Control variables.
n_bootstrap (int, default 999) – Number of bootstrap replications. Should be odd for symmetric p-value calculation.
weight_type (str, default 'rademacher') – Bootstrap weight distribution:
    - 'rademacher': P(w=1) = P(w=-1) = 0.5 (simplest, most common)
    - 'mammen': Two-point distribution matching skewness
    - 'webb': Six-point distribution (best for very few clusters)
alpha (float, default 0.05) – Significance level for confidence interval.
seed (int, optional) – Random seed for reproducibility.
impose_null (bool, default True) – Whether to impose the null hypothesis (H0: τ = 0) when constructing bootstrap samples. Recommended for hypothesis testing.
full_enumeration (bool, optional) – Whether to enumerate all 2^G Rademacher weight combinations instead of sampling them at random. If None (default), enumeration is enabled automatically when G <= 12 and weight_type='rademacher'. Full enumeration is deterministic and yields the exact bootstrap p-value, with no Monte Carlo error.
use_wildboottest (bool, default False) – Whether to delegate to the wildboottest package, which provides a faster implementation. Requires wildboottest to be installed. When True, results match Stata boottest exactly. Note: wildboottest handles boundary cases slightly differently, so its p-values are about 0.002 lower than those computed directly from the definition of the bootstrap p-value.

Returns:

Bootstrap results containing point estimate, standard error, confidence interval, p-value, and diagnostic information.

Return type:

WildClusterBootstrapResult

Notes

The wild cluster bootstrap is particularly useful when:

- Number of clusters G < 30
- Cluster sizes are unbalanced
- Few treated clusters

The method works by:

1. Estimating the original model to get residuals
2. Generating cluster-level random weights
3. Creating bootstrap residuals by multiplying original residuals by weights
4. Re-estimating the model with bootstrap outcomes
5. Computing the bootstrap distribution of t-statistics

When impose_null=True (recommended for hypothesis testing), the bootstrap outcome is constructed under the null hypothesis that the treatment effect is zero. This provides better size control.

Example Usage

Basic Wild Cluster Bootstrap

from lwdid.inference import wild_cluster_bootstrap

# Run wild cluster bootstrap on transformed data
result = wild_cluster_bootstrap(
    data=transformed_data,
    y_transformed='y_dot',
    d='treated',
    cluster_var='state',
    n_bootstrap=999,
    weight_type='rademacher'
)

print(f"Original ATT: {result.att:.4f}")
print(f"Bootstrap SE: {result.se_bootstrap:.4f}")
print(f"Bootstrap p-value: {result.pvalue:.4f}")
print(f"95% CI: [{result.ci_lower:.4f}, {result.ci_upper:.4f}]")

Weight Types

Three bootstrap weight distributions are available:

# Rademacher weights (+1 or -1 with equal probability)
result_rad = wild_cluster_bootstrap(
    data, y_transformed='y_dot', d='treated',
    cluster_var='state', weight_type='rademacher'
)

# Mammen weights (two-point distribution)
result_mam = wild_cluster_bootstrap(
    data, y_transformed='y_dot', d='treated',
    cluster_var='state', weight_type='mammen'
)

# Webb weights (six-point distribution)
result_web = wild_cluster_bootstrap(
    data, y_transformed='y_dot', d='treated',
    cluster_var='state', weight_type='webb'
)

Test Inversion for Confidence Intervals

For more accurate confidence intervals with few clusters:

from lwdid.inference import wild_cluster_bootstrap_test_inversion

# Construct CI via test inversion
result = wild_cluster_bootstrap_test_inversion(
    data=transformed_data,
    y_transformed='y_dot',
    d='treated',
    cluster_var='state',
    alpha=0.05,
    n_bootstrap=999
)

print(f"CI via test inversion: [{result.ci_lower:.4f}, {result.ci_upper:.4f}]")
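
The idea behind test inversion can be sketched independently of the library: the confidence interval is the set of null values that a level-alpha test does not reject. The snippet below is a minimal illustration under stated assumptions, not the internals of wild_cluster_bootstrap_test_inversion. The names invert_test and normal_pvalue are hypothetical, and a normal-approximation p-value stands in for the bootstrap p-value so that the example runs on its own.

```python
import math


def normal_pvalue(estimate, se, null_value):
    """Two-sided p-value for H0: effect = null_value (normal approximation).

    Hypothetical stand-in for a bootstrap p-value function.
    """
    z = abs(estimate - null_value) / se
    return math.erfc(z / math.sqrt(2.0))


def invert_test(pvalue_fn, grid, alpha=0.05):
    """Return the smallest and largest grid values that the test does not reject."""
    accepted = [tau0 for tau0 in grid if pvalue_fn(tau0) >= alpha]
    if not accepted:
        raise ValueError("no null value accepted; widen or refine the grid")
    return min(accepted), max(accepted)


# Candidate null values from -1.0 to 3.0 in steps of 0.001.
grid = [i / 1000 for i in range(-1000, 3001)]
ci_lower, ci_upper = invert_test(lambda tau0: normal_pvalue(1.0, 0.5, tau0), grid)
print(f"CI by test inversion: [{ci_lower:.3f}, {ci_upper:.3f}]")
```

With a bootstrap p-value in place of normal_pvalue, each grid point requires its own bootstrap run with the null H0: effect = tau0 imposed, which is why test inversion costs more than a single bootstrap.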

Integration with lwdid

Wild cluster bootstrap is used as a standalone function after running the main estimation. First estimate the model using lwdid(), then apply wild cluster bootstrap to the transformed data:

from lwdid import lwdid
from lwdid.inference import wild_cluster_bootstrap

# Step 1: Run standard estimation with cluster-robust SE
results = lwdid(
    data, y='outcome', d='treated', ivar='unit', tvar='year',
    post='post', rolling='demean',
    vce='cluster', cluster_var='state'
)

# Step 2: Apply wild cluster bootstrap to the transformed data
boot_result = wild_cluster_bootstrap(
    data=results.data,
    y_transformed='ydot_postavg',
    d='d_',
    cluster_var='state',
    n_bootstrap=999
)

print(f"Bootstrap p-value: {boot_result.pvalue:.4f}")

Complete Enumeration

With a small number of clusters (G <= 12), enumerating all 2^G Rademacher weight combinations yields an exact p-value with no Monte Carlo error:

# Complete enumeration with few clusters
result = wild_cluster_bootstrap(
    data=transformed_data,
    y_transformed='y_dot',
    d='treated',
    cluster_var='state',  # If G <= 12, complete enumeration is used
    weight_type='rademacher'
)

# Check if complete enumeration was used
if result.n_bootstrap == 2**result.n_clusters:
    print("Complete enumeration used - exact p-value")
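
To see why enumeration is feasible only for small G, the following sketch lists every Rademacher weight vector for a given number of clusters. It is an illustration, not the library's implementation, and the helper name enumerate_rademacher is hypothetical.

```python
from itertools import product

import numpy as np


def enumerate_rademacher(n_clusters):
    """Return a (2**G, G) array whose rows are all +/-1 weight vectors."""
    return np.array(list(product([-1.0, 1.0], repeat=n_clusters)))


weights = enumerate_rademacher(4)
print(weights.shape)  # one row per weight vector, one column per cluster

# The number of weight vectors doubles with each additional cluster.
for g in (4, 8, 12):
    print(g, 2 ** g)
```

At G = 12 there are 4096 weight vectors, the same order of magnitude as a typical n_bootstrap, so enumeration costs about as much as random sampling while removing simulation noise.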

Methodological Notes

Algorithm:

Estimate the original model and obtain residuals \(\hat{u}_{ic}\)
Generate cluster-level weights \(w_c\) from chosen distribution
Construct bootstrap residuals: \(u^*_{ic} = w_c \times \hat{u}_{ic}\)
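Form bootstrap outcomes: \(Y^*_{ic} = \hat{Y}_{ic} + u^*_{ic}\), where \(\hat{Y}_{ic}\) is the fitted value with the treatment effect restricted to zero when impose_null=True and the unrestricted fitted value otherwise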
Re-estimate the model and compute t-statistic
Repeat B times to obtain bootstrap distribution
Compute the two-sided p-value as the proportion of bootstrap t-statistics that exceed the observed t-statistic in absolute value
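
The steps above can be sketched in NumPy for the simplest case of a regression on a constant and a treatment dummy. This is a minimal illustration under stated assumptions, not the code behind wild_cluster_bootstrap: the function names are hypothetical, the variance estimator is a plain CR0 sandwich, only Rademacher weights are drawn, and with impose_null=True the residuals are taken from the restricted fit.

```python
import numpy as np


def cluster_t_stat(y, X, cluster_idx, coef, null_value=0.0):
    """Return (estimate, se, t) for one OLS coefficient with a CR0 cluster-robust variance."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    # Sum the scores X'u within each cluster, then form the sandwich estimator.
    scores = np.zeros((cluster_idx.max() + 1, X.shape[1]))
    np.add.at(scores, cluster_idx, X * resid[:, None])
    V = XtX_inv @ (scores.T @ scores) @ XtX_inv
    se = np.sqrt(V[coef, coef])
    return beta[coef], se, (beta[coef] - null_value) / se


def wild_cluster_bootstrap_sketch(y, d, clusters, n_bootstrap=999,
                                  impose_null=True, seed=None):
    """Hypothetical sketch of a Rademacher wild cluster bootstrap."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    labels, cluster_idx = np.unique(clusters, return_inverse=True)
    X = np.column_stack([np.ones_like(d), d])

    # Original estimate and t-statistic for H0: effect = 0.
    att, _, t_orig = cluster_t_stat(y, X, cluster_idx, coef=1)

    # Fitted values and residuals used to generate bootstrap outcomes:
    # restricted fit (constant only) under the null, full fit otherwise.
    X_dgp = X[:, :1] if impose_null else X
    beta_dgp, *_ = np.linalg.lstsq(X_dgp, y, rcond=None)
    fitted = X_dgp @ beta_dgp
    resid = y - fitted
    # Bootstrap t-statistics are centred on the effect built into the samples.
    center = 0.0 if impose_null else att

    t_boot = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        w = rng.choice([-1.0, 1.0], size=labels.size)  # one weight per cluster
        y_star = fitted + w[cluster_idx] * resid
        t_boot[b] = cluster_t_stat(y_star, X, cluster_idx, coef=1,
                                   null_value=center)[2]

    pvalue = float(np.mean(np.abs(t_boot) >= abs(t_orig)))
    return att, t_orig, t_boot, pvalue


# Synthetic data: 20 clusters of 10 units, first 10 clusters treated, true effect 5.
rng = np.random.default_rng(0)
clusters = np.repeat(np.arange(20), 10)
d = (clusters < 10).astype(float)
y = 5.0 * d + rng.normal(size=20)[clusters] + rng.normal(size=200)

att, t_orig, t_boot, pvalue = wild_cluster_bootstrap_sketch(
    y, d, clusters, n_bootstrap=199, seed=1
)
print(f"ATT = {att:.3f}, t = {t_orig:.2f}, bootstrap p = {pvalue:.3f}")
```

The p-value here counts ties as exceedances (>=); the boundary convention is a design choice, and the library may use a different one.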

Weight Distributions:

Rademacher: \(P(w = 1) = P(w = -1) = 0.5\)
Mammen: Two-point distribution with \(E[w] = 0\), \(E[w^2] = 1\), \(E[w^3] = 1\)
Webb: Six-point distribution for improved performance with few clusters
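
For concreteness, the sketch below writes out the support points and probabilities commonly given for these three distributions and checks their moments. It is an illustration, not the library's code; WEIGHT_DISTRIBUTIONS, draw_weights, and moment are hypothetical names.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)

# Support points and probabilities for each weight distribution.
WEIGHT_DISTRIBUTIONS = {
    "rademacher": (np.array([-1.0, 1.0]), np.array([0.5, 0.5])),
    "mammen": (
        np.array([-(SQRT5 - 1) / 2, (SQRT5 + 1) / 2]),
        np.array([(SQRT5 + 1) / (2 * SQRT5), (SQRT5 - 1) / (2 * SQRT5)]),
    ),
    "webb": (
        np.array([-np.sqrt(1.5), -1.0, -np.sqrt(0.5),
                  np.sqrt(0.5), 1.0, np.sqrt(1.5)]),
        np.full(6, 1 / 6),
    ),
}


def draw_weights(weight_type, n_clusters, rng):
    """Draw one weight per cluster from the named distribution."""
    values, probs = WEIGHT_DISTRIBUTIONS[weight_type]
    return rng.choice(values, size=n_clusters, p=probs)


def moment(weight_type, k):
    """Exact k-th moment E[w^k], computed from the support and probabilities."""
    values, probs = WEIGHT_DISTRIBUTIONS[weight_type]
    return float(np.sum(probs * values ** k))


for name in WEIGHT_DISTRIBUTIONS:
    print(name, [round(moment(name, k), 6) for k in (1, 2, 3)])

print(draw_weights("webb", 8, np.random.default_rng(0)))
```

All three distributions have mean 0 and variance 1. Mammen additionally has a third moment of 1, while Rademacher and Webb are symmetric, so their third moment is 0. Webb's six support points give many more distinct weight vectors than Rademacher's two when G is very small.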

Null Imposition:

The impose_null parameter determines whether bootstrap samples are constructed under the null hypothesis (H0: treatment effect = 0). Imposing the null generally gives better size control, so it is the recommended setting for hypothesis tests.

Guidelines

When to Use Wild Cluster Bootstrap:

Number of clusters < 30
Treatment varies at cluster level
Standard cluster-robust SEs may be unreliable

Recommended Settings:

n_bootstrap=999 (the default), or n_bootstrap=9999 for publication-quality results
weight_type='rademacher' is the standard choice
impose_null=True for testing H0: effect = 0