Randomization Inference Module ============================== Hypothesis testing via randomization inference for small and large samples. This module implements randomization inference (RI) for testing the sharp null hypothesis of no treatment effect in difference-in-differences settings. The approach provides valid p-values without relying on asymptotic distributional assumptions. **Applicability**: RI is applicable to both small-sample and large-sample scenarios. While it is particularly valuable for small samples where t-based inference may be unreliable, it also serves as a robust alternative for large samples when normality assumptions are questionable. See Lee and Wooldridge (2026) for discussion of RI in the small-sample context. .. automodule:: lwdid.randomization :no-members: Main Function ------------- .. autofunction:: lwdid.randomization.randomization_inference :no-index: Resampling Methods ------------------ The implementation supports two resampling methods: **Permutation (Classical Fisher Randomization Inference)** Permutes treatment labels without replacement, preserving the original number of treated and control units in each replication. This is the classical Fisher randomization approach and is generally recommended for design-based randomization inference. **Bootstrap (Resampling with Replacement)** Resamples treatment labels with replacement. May produce degenerate draws (all treated or all control) which are excluded from p-value computation. This method is the default for backward compatibility. Parameters ---------- firstpost_df : pd.DataFrame Cross-sectional data for the first post-treatment period, containing one observation per unit. y_col : str, default 'ydot_postavg' Column name of the transformed outcome variable. d_col : str, default ``'d_'`` Column name of the binary treatment indicator. ivar : str, default 'ivar' Column name of the unit identifier. rireps : int, default 1000 Number of randomization replications. Higher values provide more precise p-values but increase computation time. seed : int or None, optional Random seed for reproducibility. ri_method : {'bootstrap', 'permutation'}, default 'bootstrap' Resampling method for generating the null distribution. controls : list of str or None, optional Control variables to include in the regression model. Returns ------- dict Dictionary containing: - ``p_value``: Two-sided p-value (proportion with \|ATT_perm\| >= \|ATT_obs\|) - ``ri_method``: Resampling method used - ``ri_reps``: Total replications requested - ``ri_valid``: Number of valid replications - ``ri_failed``: Number of failed replications - ``ri_failure_rate``: Proportion of failed replications Example Usage ------------- .. code-block:: python from lwdid import lwdid # Estimation with randomization inference results = lwdid( data, y='outcome', d='treated', ivar='unit', tvar='year', post='post', rolling='detrend', ri=True, rireps=1000, ri_method='permutation', seed=42 ) # Access RI results print(f"RI p-value: {results.ri_pvalue:.4f}") print(f"Method: {results.ri_method}") print(f"Valid replications: {results.ri_valid}/{results.rireps}") Methodological Notes -------------------- Randomization inference tests the sharp null hypothesis that all unit-level treatment effects are exactly zero. Under this null, treatment assignment is uninformative about potential outcomes, and permuting treatment labels generates the null distribution of the test statistic. **Advantages** - Does not rely on normality or homoskedasticity assumptions - Naturally accommodates heteroskedasticity and non-normality - Provides exact p-values under the randomization model **Limitations** - Computationally intensive for large numbers of replications - Tests only the sharp null hypothesis (zero effect for all units) - Bootstrap method may have high failure rates with extreme treatment proportions See Also -------- :func:`lwdid.lwdid` : Main estimation function with ``ri=True`` option. :doc:`../methodological_notes` : Theoretical foundations.