Inference Module (inference)

Advanced inference methods for difference-in-differences estimation.

This module provides inference methods beyond standard asymptotic approaches, including the wild cluster bootstrap for reliable inference with few clusters.

Overview

Standard cluster-robust standard errors require a large number of clusters for valid asymptotic inference. When the number of clusters is small (typically fewer than 20-30), wild cluster bootstrap methods provide more accurate inference by perturbing the residuals with cluster-level random weights and re-estimating the model, rather than relying on asymptotic approximations.

This module implements the wild cluster bootstrap procedure with extensions for difference-in-differences settings.

Wild Cluster Bootstrap

class lwdid.inference.WildClusterBootstrapResult(att, se_bootstrap, ci_lower, ci_upper, pvalue, n_clusters, n_bootstrap, weight_type, t_stat_original, t_stats_bootstrap, rejection_rate, ci_method='percentile_t')[source]

Result of wild cluster bootstrap inference.

This class contains the results of wild cluster bootstrap, which provides more reliable inference when the number of clusters is small.

Variables:
  • att (float) – Point estimate of ATT.

  • se_bootstrap (float) – Bootstrap standard error.

  • ci_lower (float) – Lower bound of bootstrap confidence interval.

  • ci_upper (float) – Upper bound of bootstrap confidence interval.

  • pvalue (float) – Bootstrap p-value (two-sided).

  • n_clusters (int) – Number of clusters.

  • n_bootstrap (int) – Number of bootstrap replications.

  • weight_type (str) – Type of bootstrap weights used.

  • t_stat_original (float) – Original t-statistic.

  • t_stats_bootstrap (np.ndarray) – Bootstrap t-statistics.

  • rejection_rate (float) – Proportion of bootstrap t-statistics exceeding the original t-statistic.

  • ci_method (str) – Method used for CI construction ('percentile_t' or 'test_inversion').

att: float
se_bootstrap: float
ci_lower: float
ci_upper: float
pvalue: float
n_clusters: int
n_bootstrap: int
weight_type: str
t_stat_original: float
t_stats_bootstrap: ndarray
rejection_rate: float
ci_method: str = 'percentile_t'
summary()[source]

Generate human-readable summary of bootstrap results.

Returns:

Formatted summary string.

Return type:

str

lwdid.inference.wild_cluster_bootstrap(data, y_transformed, d, cluster_var, controls=None, n_bootstrap=999, weight_type='rademacher', alpha=0.05, seed=None, impose_null=True, full_enumeration=None, use_wildboottest=False)[source]

Perform wild cluster bootstrap for inference with few clusters.

Implements the wild cluster bootstrap of Cameron, Gelbach, and Miller (2008). This method provides more reliable inference when the number of clusters is small (< 20-30).

Algorithm:

  1. Estimate original model and obtain residuals

  2. For each bootstrap replication:

     (a) Generate cluster-level weights w_c (Rademacher, Mammen, or Webb).

     (b) Construct bootstrap residuals u*_ic = w_c * u_ic.

     (c) Construct the bootstrap outcome y*_ic = X_ic * beta_hat + u*_ic (if impose_null) or y*_ic = y_hat_ic + u*_ic (if not impose_null).

     (d) Re-estimate the model and compute the t-statistic.

  3. Compute bootstrap p-value and confidence interval

Parameters:
  • data (pd.DataFrame) – Regression data.

  • y_transformed (str) – Transformed outcome variable (e.g., after within-transformation).

  • d (str) – Treatment indicator variable.

  • cluster_var (str) – Clustering variable.

  • controls (list[str], optional) – Control variables.

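  • n_bootstrap (int, default 999) – Number of bootstrap replications. Should be odd; values such as 999 or 9999 make alpha * (n_bootstrap + 1) an integer at conventional significance levels, which is the usual choice for bootstrap tests.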

  • weight_type (str, default 'rademacher') – Bootstrap weight distribution:

      - 'rademacher': P(w=1) = P(w=-1) = 0.5 (simplest, most common)

      - 'mammen': Two-point distribution matching skewness

      - 'webb': Six-point distribution (best for very few clusters)

  • alpha (float, default 0.05) – Significance level for confidence interval.

  • seed (int, optional) – Random seed for reproducibility.

  • impose_null (bool, default True) – Whether to impose null hypothesis (H0: τ = 0) when constructing bootstrap samples. Recommended for hypothesis testing.

  • full_enumeration (bool, optional) – Whether to enumerate all 2^G Rademacher weight combinations instead of sampling them at random. If None (default), enumeration is enabled automatically when G <= 12 and weight_type='rademacher'. Full enumeration is deterministic and computes the exact bootstrap p-value, free of Monte Carlo error.

  • use_wildboottest (bool, default False) – Whether to use the fast algorithm from the wildboottest package. Requires wildboottest to be installed. When True, uses the optimized implementation that matches Stata boottest exactly. Note: wildboottest handles boundary cases slightly differently, resulting in p-values about 0.002 lower than those from the built-in implementation.

Returns:

Bootstrap results containing point estimate, standard error, confidence interval, p-value, and diagnostic information.

Return type:

WildClusterBootstrapResult

Notes

The wild cluster bootstrap is particularly useful when:

  • The number of clusters G is below 30

  • Cluster sizes are unbalanced

  • Few clusters are treated

The method works by:

  1. Estimating the original model to get residuals

  2. Generating cluster-level random weights

  3. Creating bootstrap residuals by multiplying the original residuals by the weights

  4. Re-estimating the model with the bootstrap outcomes

  5. Computing the bootstrap distribution of t-statistics

When impose_null=True (recommended for hypothesis testing), the bootstrap outcome is constructed under the null hypothesis that the treatment effect is zero. This provides better size control.

See also

diagnose_clustering

Diagnose clustering structure.

recommend_clustering_level

Get recommendation for clustering level.

lwdid.inference.wild_cluster_bootstrap_test_inversion(data, y_transformed, d, cluster_var, controls=None, n_bootstrap=999, weight_type='rademacher', alpha=0.05, seed=None, grid_points=25, ci_tol=0.01)[source]

Compute wild cluster bootstrap confidence interval using test inversion.

This method constructs confidence intervals by inverting hypothesis tests: CI = {theta : p(theta) >= alpha}

That is, the CI consists of all null hypothesis values for which the bootstrap p-value exceeds the significance level.

Parameters:
  • data (pd.DataFrame) – Regression data.

  • y_transformed (str) – Transformed outcome variable.

  • d (str) – Treatment variable.

  • cluster_var (str) – Clustering variable.

  • controls (list[str], optional) – Control variables.

  • n_bootstrap (int, default 999) – Number of bootstrap replications.

  • weight_type (str, default 'rademacher') – Type of bootstrap weights.

  • alpha (float, default 0.05) – Significance level.

  • seed (int, optional) – Random seed for reproducibility.

  • grid_points (int, default 25) – Number of grid points for initial search.

  • ci_tol (float, default 0.01) – Tolerance for CI boundary precision.

Returns:

Results containing test inversion CI.

Return type:

WildClusterBootstrapResult

Notes

Advantages of test inversion CI:

  1. Can be more accurate than percentile-t CI in some settings.

  2. Handles asymmetric distributions appropriately.

Disadvantages:

  1. More computationally intensive (requires multiple bootstrap runs).

  2. Requires numerical optimization to find boundaries.

Example Usage

Basic Wild Cluster Bootstrap

from lwdid.inference import wild_cluster_bootstrap

# Run wild cluster bootstrap on transformed data
result = wild_cluster_bootstrap(
    data=transformed_data,
    y_transformed='y_dot',
    d='treated',
    cluster_var='state',
    n_bootstrap=999,
    weight_type='rademacher'
)

print(f"Original ATT: {result.att:.4f}")
print(f"Bootstrap SE: {result.se_bootstrap:.4f}")
print(f"Bootstrap p-value: {result.pvalue:.4f}")
print(f"95% CI: [{result.ci_lower:.4f}, {result.ci_upper:.4f}]")

Weight Types

Three bootstrap weight distributions are available:

# Rademacher weights (+1 or -1 with equal probability)
result_rad = wild_cluster_bootstrap(
    data, y_transformed='y_dot', d='treated',
    cluster_var='state', weight_type='rademacher'
)

# Mammen weights (two-point distribution)
result_mam = wild_cluster_bootstrap(
    data, y_transformed='y_dot', d='treated',
    cluster_var='state', weight_type='mammen'
)

# Webb weights (six-point distribution)
result_web = wild_cluster_bootstrap(
    data, y_transformed='y_dot', d='treated',
    cluster_var='state', weight_type='webb'
)

Test Inversion for Confidence Intervals

Test inversion can give more accurate confidence intervals than the percentile-t method when clusters are few, at a higher computational cost:

from lwdid.inference import wild_cluster_bootstrap_test_inversion

# Construct CI via test inversion
result = wild_cluster_bootstrap_test_inversion(
    data=transformed_data,
    y_transformed='y_dot',
    d='treated',
    cluster_var='state',
    alpha=0.05,
    n_bootstrap=999
)

print(f"CI via test inversion: [{result.ci_lower:.4f}, {result.ci_upper:.4f}]")
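
The idea behind test inversion can be illustrated without the library. The sketch below is not the lwdid implementation (which exposes grid_points and ci_tol); it is a minimal stand-in that brackets each boundary by doubling the step and then bisects. The names invert_test and toy_pvalue are hypothetical, and the p-value function is a normal approximation used in place of a bootstrap p-value so that the example runs on its own.

```python
import math


def toy_pvalue(theta, estimate=1.2, se=0.4):
    """Two-sided normal p-value for H0: tau = theta (stand-in for a bootstrap p-value)."""
    z = abs(estimate - theta) / se
    return math.erfc(z / math.sqrt(2.0))


def invert_test(pvalue_at, estimate, step, alpha=0.05, tol=1e-4, max_expand=50):
    """Return (lower, upper): the set of theta with p(theta) >= alpha around the estimate."""

    def boundary(direction):
        inside = estimate                      # assumed not rejected: p(estimate) >= alpha
        outside = estimate + direction * step
        for _ in range(max_expand):
            if pvalue_at(outside) < alpha:     # found a rejected value: boundary is bracketed
                break
            inside = outside
            outside = estimate + 2.0 * (outside - estimate)
        else:
            raise RuntimeError("could not bracket the CI boundary")
        while abs(outside - inside) > tol:     # bisect between accepted and rejected values
            mid = 0.5 * (inside + outside)
            if pvalue_at(mid) >= alpha:
                inside = mid
            else:
                outside = mid
        return inside

    return boundary(-1.0), boundary(+1.0)


lower, upper = invert_test(toy_pvalue, estimate=1.2, step=0.4)
print(f"CI via test inversion: [{lower:.3f}, {upper:.3f}]")
```

With a real bootstrap p-value, each call to pvalue_at would rerun the bootstrap under a different null value, which is why the method is more expensive than a percentile-t interval. Reusing the same seed across null values keeps the p-value function stable enough for a boundary search.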

Integration with lwdid

Wild cluster bootstrap is used as a standalone function after running the main estimation. First estimate the model using lwdid(), then apply wild cluster bootstrap to the transformed data:

from lwdid import lwdid
from lwdid.inference import wild_cluster_bootstrap

# Step 1: Run standard estimation with cluster-robust SE
results = lwdid(
    data, y='outcome', d='treated', ivar='unit', tvar='year',
    post='post', rolling='demean',
    vce='cluster', cluster_var='state'
)

# Step 2: Apply wild cluster bootstrap to the transformed data
boot_result = wild_cluster_bootstrap(
    data=results.data,
    y_transformed='ydot_postavg',
    d='d_',
    cluster_var='state',
    n_bootstrap=999
)

print(f"Bootstrap p-value: {boot_result.pvalue:.4f}")

Complete Enumeration

With Rademacher weights and a small number of clusters (G <= 12), all 2^G possible weight combinations are enumerated by default, which yields an exact bootstrap p-value with no Monte Carlo error:

# Complete enumeration with few clusters
result = wild_cluster_bootstrap(
    data=transformed_data,
    y_transformed='y_dot',
    d='treated',
    cluster_var='state',  # If G <= 12, complete enumeration is used
    weight_type='rademacher'
)

# Check if complete enumeration was used
if result.n_bootstrap == 2**result.n_clusters:
    print("Complete enumeration used - exact p-value")
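
To make the enumeration idea concrete, here is a small stand-alone sketch. It does not call lwdid; it uses a toy statistic (the absolute weighted sum of hypothetical cluster-level scores) rather than a re-estimated t-statistic, and the function name exact_rademacher_pvalue is an assumption made for illustration.

```python
from itertools import product


def exact_rademacher_pvalue(cluster_scores):
    """Exact two-sided p-value of a toy statistic over all 2^G sign vectors."""
    observed = abs(sum(cluster_scores))
    n_draws = 0
    n_extreme = 0
    # product((-1, 1), repeat=G) visits each of the 2^G Rademacher draws exactly once
    for signs in product((-1, 1), repeat=len(cluster_scores)):
        stat = abs(sum(w * s for w, s in zip(signs, cluster_scores)))
        n_draws += 1
        n_extreme += stat >= observed
    return n_extreme / n_draws, n_draws


pvalue, n_draws = exact_rademacher_pvalue([3.0, 1.0, 1.0, 1.0])
print(n_draws, pvalue)  # 16 0.125
```

Because every sign vector is visited once, the result does not depend on a seed. In this example only the all-plus and all-minus vectors reach the observed value, so the p-value is 2/16.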

Methodological Notes

Algorithm:

  1. Estimate the original model and obtain residuals \(\hat{u}_{ic}\)

  2. Generate cluster-level weights \(w_c\) from chosen distribution

  3. Construct bootstrap residuals: \(u^*_{ic} = w_c \times \hat{u}_{ic}\)

  4. Form bootstrap outcomes: \(Y^*_{ic} = \hat{Y}_{ic} + u^*_{ic}\)

  5. Re-estimate the model and compute t-statistic

  6. Repeat B times to obtain bootstrap distribution

  7. Compute the two-sided p-value as the proportion of bootstrap t-statistics that exceed the observed t-statistic in absolute value
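
The steps above can be sketched in plain Python. This is an illustration under simplifying assumptions, not the lwdid implementation: a single binary regressor with an intercept, a CR0 cluster-robust variance without small-sample correction, Rademacher weights, and a symmetric two-sided p-value. The names fit_slope and wild_cluster_bootstrap_sketch are hypothetical.

```python
import random


def fit_slope(y, d, cluster):
    """OLS of y on (1, d); returns (tau, se) with a CR0 cluster-robust se."""
    n = len(y)
    d_bar, y_bar = sum(d) / n, sum(y) / n
    d_dev = [di - d_bar for di in d]
    sxx = sum(v * v for v in d_dev)
    tau = sum(v * (yi - y_bar) for v, yi in zip(d_dev, y)) / sxx
    intercept = y_bar - tau * d_bar
    resid = [yi - intercept - tau * di for yi, di in zip(y, d)]
    scores = {}  # per-cluster sums of (d - d_bar) * residual
    for v, u, c in zip(d_dev, resid, cluster):
        scores[c] = scores.get(c, 0.0) + v * u
    var = sum(s * s for s in scores.values()) / (sxx * sxx)
    return tau, var ** 0.5


def wild_cluster_bootstrap_sketch(y, d, cluster, n_bootstrap=999,
                                  impose_null=True, seed=0):
    rng = random.Random(seed)
    n = len(y)
    tau, se = fit_slope(y, d, cluster)
    t_orig = tau / se
    y_bar, d_bar = sum(y) / n, sum(d) / n
    if impose_null:
        fitted, center = [y_bar] * n, 0.0       # restricted fit under H0: tau = 0
    else:
        a = y_bar - tau * d_bar
        fitted, center = [a + tau * di for di in d], tau
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    clusters = sorted(set(cluster))
    t_boot = []
    for _ in range(n_bootstrap):
        w = {c: rng.choice((-1.0, 1.0)) for c in clusters}  # one weight per cluster
        y_star = [f + w[c] * u for f, u, c in zip(fitted, resid, cluster)]
        tau_b, se_b = fit_slope(y_star, d, cluster)
        t_boot.append((tau_b - center) / se_b)
    pvalue = sum(abs(t) >= abs(t_orig) for t in t_boot) / n_bootstrap
    return tau, t_orig, pvalue


# Simulated data: 8 clusters of 5 units, the first 4 clusters treated
data_rng = random.Random(42)
y, d, cluster = [], [], []
for c in range(8):
    treated = 1.0 if c < 4 else 0.0
    shock = data_rng.gauss(0.0, 1.0)            # common cluster-level shock
    for _ in range(5):
        cluster.append(c)
        d.append(treated)
        y.append(treated + shock + data_rng.gauss(0.0, 1.0))

tau, t_orig, pvalue = wild_cluster_bootstrap_sketch(y, d, cluster)
print(f"tau = {tau:.3f}, t = {t_orig:.3f}, bootstrap p = {pvalue:.3f}")
```

The key design point is that every unit in a cluster shares one weight, so the within-cluster dependence of the residuals is carried into each bootstrap sample.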

Weight Distributions:

  • Rademacher: \(P(w = 1) = P(w = -1) = 0.5\)

  • Mammen: Two-point distribution with \(E[w] = 0\), \(E[w^2] = 1\), \(E[w^3] = 1\)

  • Webb: Six-point distribution for improved performance with few clusters
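
For reference, the three distributions can be written out with their standard support points and probabilities. The sketch below uses the textbook definitions (Mammen's two-point distribution based on the golden ratio, Webb's six equally likely points); the helper names are hypothetical, and lwdid's internal implementation may differ in detail.

```python
import math
import random


def weight_support(weight_type):
    """Return (values, probabilities) for a wild bootstrap weight distribution."""
    if weight_type == "rademacher":
        return [-1.0, 1.0], [0.5, 0.5]
    if weight_type == "mammen":
        s5 = math.sqrt(5.0)
        values = [-(s5 - 1.0) / 2.0, (s5 + 1.0) / 2.0]
        probs = [(s5 + 1.0) / (2.0 * s5), (s5 - 1.0) / (2.0 * s5)]
        return values, probs
    if weight_type == "webb":
        pts = [math.sqrt(0.5), 1.0, math.sqrt(1.5)]
        values = [-p for p in reversed(pts)] + pts
        return values, [1.0 / 6.0] * 6
    raise ValueError(f"unknown weight_type: {weight_type!r}")


def moment(weight_type, k):
    """Exact k-th moment E[w^k] computed from the support and probabilities."""
    values, probs = weight_support(weight_type)
    return sum(p * v ** k for v, p in zip(values, probs))


def draw_weights(weight_type, n_clusters, rng):
    """Draw one weight per cluster."""
    values, probs = weight_support(weight_type)
    return rng.choices(values, weights=probs, k=n_clusters)


rng = random.Random(0)
for name in ("rademacher", "mammen", "webb"):
    moments = [round(moment(name, k), 6) for k in (1, 2, 3)]
    print(name, moments, draw_weights(name, 4, rng))
```

All three distributions have mean 0 and variance 1. Only Mammen has a nonzero third moment, and Webb's six support points give many more distinct bootstrap samples than Rademacher's two when G is very small.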

Null Imposition:

The impose_null parameter determines whether bootstrap samples are constructed under the null hypothesis (H0: treatment effect = 0). Imposing the null generally improves size control with few clusters and is recommended for hypothesis testing.

Guidelines

When to Use Wild Cluster Bootstrap:

  • Number of clusters < 30

  • Treatment varies at cluster level

  • Standard cluster-robust SEs may be unreliable

Recommended Settings:

  • n_bootstrap=999 for routine use, or n_bootstrap=9999 for publication

  • weight_type='rademacher' is the standard choice; 'webb' is preferable with very few clusters

  • impose_null=True for testing H0: effect = 0

See Also