User Guide ========== This guide covers all aspects of using the ``lwdid`` package for difference-in-differences estimation with rolling transformations. .. contents:: Table of Contents :local: :depth: 2 Overview -------- The ``lwdid`` package implements the Lee and Wooldridge methods for difference-in- differences estimation with panel data, covering three main scenarios: **Small-Sample Common Timing** (Lee and Wooldridge 2026): - Exact t-based inference under CLM assumptions (normality and homoskedasticity) - Designed for settings with small numbers of treated or control units - Works best with large time dimensions **Large-Sample Common Timing** (Lee and Wooldridge 2025): - Asymptotic inference with robust standard errors - Supports heteroskedasticity-robust (HC0-HC4) and cluster-robust options - Multiple estimators: RA, IPW, IPWRA, PSM **Staggered Adoption** (Lee and Wooldridge 2025): - Units treated at different times - Cohort-time specific effect estimation - Flexible control group strategies and aggregation options Key features: - Four transformation methods: demean, detrend, demeanq, detrendq - Multiple variance estimators for different assumptions - Randomization inference for small samples - Event study visualization for staggered designs Scenario Selection Guide ------------------------ Use this guide to choose the appropriate scenario for your analysis: - **Small-sample common timing**: Use when N is small with common treatment timing. Key parameters: ``post``, ``vce=None``. - **Large-sample common timing**: Use when N is larger with common treatment timing. Key parameters: ``post``, ``vce='hc3'``. - **Staggered adoption**: Use when units are treated at different times. Key parameters: ``gvar``, ``aggregate``. **Decision Flowchart:** .. code-block:: text ┌─────────────────────────────────┐ │ Treatment timing varies │ │ across units? │ └───────────────┬─────────────────┘ │ ┌───────────────┴───────────────┐ │ │ ▼ ▼ ┌────────┐ ┌─────────┐ │ YES │ │ NO │ └────┬───┘ └────┬────┘ │ │ ▼ ▼ ┌───────────────────────────┐ ┌─────────────────────────────┐ │ STAGGERED ADOPTION │ │ Is cross-sectional N │ │ Use: gvar parameter │ │ small (< 50)? │ │ Estimator: ra/ipwra │ └───────────────┬─────────────┘ │ Aggregate: cohort/overall│ │ └───────────────────────────┘ ┌───────────────┴───────────────┐ │ │ ▼ ▼ ┌────────┐ ┌─────────┐ │ YES │ │ NO │ └────┬───┘ └────┬────┘ │ │ ▼ ▼ ┌─────────────────────────────────────┐ ┌─────────────────────────────────┐ │ SMALL-SAMPLE COMMON TIMING │ │ LARGE-SAMPLE COMMON TIMING │ │ Use: post parameter │ │ Use: post parameter │ │ VCE: None (exact t-inference) │ │ VCE: hc3 or robust │ │ Alt: ri=True for RI │ │ Estimator: ra/ipwra │ │ Reference: Lee and Wooldridge │ │ Reference: Lee and Wooldridge │ │ (2026) │ │ (2025) │ └─────────────────────────────────────┘ └─────────────────────────────────┘ Quick Reference --------------- For a quick start, see :doc:`quickstart`. For complete API details, see :doc:`api/index`. Transformation Methods ---------------------- Four transformation methods convert panel data into cross-sectional form. The choice depends on data structure and assumptions. demean: Standard DiD with Unit Fixed Effects ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **When to use**: - Annual data or data without seasonal patterns - Control for time-invariant unit characteristics - At least 1 pre-treatment period (:math:`T_0 \geq 1`) **What it does**: Removes unit-specific pre-treatment means from each observation. **Mathematical form** (Lee and Wooldridge 2026): For each unit :math:`i`, compute the pre-treatment mean :math:`\bar{Y}_{i,pre}` and transform: .. math:: \dot{Y}_{it} = Y_{it} - \bar{Y}_{i,pre} \quad \text{for all } t Then estimate a cross-sectional regression on the transformed data. Example: .. code-block:: python results = lwdid( data, y='outcome', d='treated', ivar='unit', tvar='year', post='post', rolling='demean' ) detrend: DiD with Unit-Specific Linear Trends ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **When to use**: - Units exhibit different linear trends in the pre-treatment period - Control for both unit fixed effects and unit-specific trends - At least 2 pre-treatment periods (:math:`T_0 \geq 2`) **What it does**: Estimates and removes unit-specific linear trends from the pre-treatment data. **Mathematical form** (Lee and Wooldridge 2026): For each unit :math:`i`, estimate linear trend from pre-treatment data: .. math:: Y_{it} = \alpha_i + \beta_i t + \varepsilon_{it} \quad \text{(for } t \text{ in pre-treatment period)} Then detrend all observations using the estimated trend. Example: .. code-block:: python results = lwdid( data, y='outcome', d='treated', ivar='unit', tvar='year', post='post', rolling='detrend' ) demeanq: Seasonal Data with Seasonal Fixed Effects ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **When to use**: - Periodic data with seasonal patterns (quarterly, monthly, or weekly) - Control for both unit and season-of-year effects - Pre-treatment sample rich enough to estimate seasonal fixed effects for each unit: for each unit, the number of pre-treatment observations must be at least Q + 1 (where Q is the number of seasons per year) - For each unit, every season that appears in the post-treatment period must also appear in its pre-treatment period (season-coverage condition). **What it does**: Removes unit-specific pre-treatment means and season-of-year effects. **Supported seasonal periods (Q parameter)**: - **Q=4** (default): Quarterly data (4 seasons per year) - **Q=12**: Monthly data (12 seasons per year) - **Q=52**: Weekly data (52 seasons per year) **Data requirements**: - For quarterly data: Pass ``tvar`` as a list: ``tvar=['year', 'quarter']`` - For monthly/weekly data: Use a single time variable with ``Q`` and ``season_var`` parameters **Examples**: Quarterly data (Q=4, default): .. code-block:: python results = lwdid( data_q, y='sales', d='treated', ivar='store', tvar=['year', 'quarter'], # Composite time variable post='post', rolling='demeanq' ) Monthly data (Q=12): .. code-block:: python results = lwdid( data_m, y='sales', d='treated', ivar='store', tvar='time', # Single time index post='post', rolling='demeanq', Q=12, # 12 seasons per year season_var='month' # Month indicator (1-12) ) Weekly data (Q=52): .. code-block:: python results = lwdid( data_w, y='sales', d='treated', ivar='store', tvar='time', # Single time index post='post', rolling='demeanq', Q=52, # 52 seasons per year season_var='week' # Week indicator (1-52) ) detrendq: Seasonal Data with Trends and Seasonal Effects ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **When to use**: - Periodic data with both trends and seasonal patterns - Control for unit trends and season-of-year effects - At least Q + 2 pre-treatment observations per unit (where Q is the number of seasons per year) - For each unit, every season that appears in the post-treatment period must also appear in its pre-treatment period (season-coverage condition). **What it does**: Combines detrending with seasonal adjustment. **Supported seasonal periods (Q parameter)**: - **Q=4** (default): Quarterly data (4 seasons per year) - **Q=12**: Monthly data (12 seasons per year) - **Q=52**: Weekly data (52 seasons per year) **Examples**: Quarterly data (Q=4, default): .. code-block:: python results = lwdid( data_q, y='sales', d='treated', ivar='store', tvar=['year', 'quarter'], post='post', rolling='detrendq' ) Monthly data (Q=12): .. code-block:: python results = lwdid( data_m, y='sales', d='treated', ivar='store', tvar='time', post='post', rolling='detrendq', Q=12, season_var='month' ) Weekly data (Q=52): .. code-block:: python results = lwdid( data_w, y='sales', d='treated', ivar='store', tvar='time', post='post', rolling='detrendq', Q=52, season_var='week' ) Automatic Frequency Detection ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ For convenience, ``lwdid`` can automatically detect the data frequency and set the appropriate ``Q`` parameter: .. code-block:: python results = lwdid( data, y='sales', d='treated', ivar='store', tvar='time', post='post', rolling='demeanq', auto_detect_frequency=True # Automatically detect Q ) When ``auto_detect_frequency=True``, the package examines the time variable to determine the most likely data frequency: - Detects monthly data (Q=12) based on 12 observations per year pattern - Detects weekly data (Q=52) based on 52 observations per year pattern - Falls back to quarterly (Q=4) if frequency cannot be determined **Note**: If both an explicit ``Q`` (other than the default 4) and ``auto_detect_frequency=True`` are specified, and the detected frequency differs from the explicit ``Q``, a warning is issued and the explicit ``Q`` value takes precedence. No error is raised. Variance Estimation ------------------- The package supports multiple variance estimators for different assumptions about the error structure. OLS (Homoskedastic) ~~~~~~~~~~~~~~~~~~~ **When to use**: - Errors are homoskedastic and normally distributed - Exact t-based inference is desired **Specification**: .. code-block:: python results = lwdid(..., vce=None) # Default **Degrees of freedom**: - Non-clustered: df equals the residual degrees of freedom from the cross-sectional regression (df_resid). In a simple specification without controls this is N - 2 (intercept and treatment indicator); with controls and interactions it is N minus the total number of estimated parameters. - Clustered: df = G - 1 (G clusters) HC0 (White's Heteroskedasticity-Consistent) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **When to use**: - Heteroskedasticity is suspected - Sample size is large **Specification**: .. code-block:: python results = lwdid(..., vce='hc0') **Note**: HC0 is the original heteroskedasticity-consistent estimator without finite-sample adjustments. HC1 (Heteroskedasticity-Robust) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **When to use**: - Heteroskedasticity is suspected - Sample size is moderate to large, so asymptotic approximations are more reliable **Specification**: .. code-block:: python results = lwdid(..., vce='hc1') # or equivalently: results = lwdid(..., vce='robust') Note: ``'hc1'`` and ``'robust'`` are aliases for the same estimator. HC1 applies a degrees-of-freedom correction (n/(n-k)) to HC0. HC2 (Leverage-Adjusted) ~~~~~~~~~~~~~~~~~~~~~~~~ **When to use**: - Heteroskedasticity is suspected - Sample size is small to moderate - Observations have varying leverage **Specification**: .. code-block:: python results = lwdid(..., vce='hc2') **Note**: HC2 divides squared residuals by (1 - h_ii) where h_ii is the diagonal of the hat matrix. Provides better performance than HC1 when leverage varies across observations. HC3 (Small-Sample Adjusted Robust) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **When to use**: - Heteroskedasticity is suspected - Sample size is small or moderate **Specification**: .. code-block:: python results = lwdid(..., vce='hc3') **Advantage**: HC3 divides squared residuals by (1 - h_ii)^2, providing better finite-sample performance than HC1 and HC2 in small samples. HC4 (High-Leverage Adjusted) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **When to use**: - Heteroskedasticity is suspected - Data contains high-leverage observations **Specification**: .. code-block:: python results = lwdid(..., vce='hc4') **Note**: HC4 uses an adaptive exponent that increases the adjustment for observations with high leverage. Recommended when the design matrix contains potentially influential observations. Cluster-Robust ~~~~~~~~~~~~~~ **When to use**: - Errors are correlated within clusters (e.g., states, regions) - Multiple units per cluster **Specification**: .. code-block:: python results = lwdid(..., vce='cluster', cluster_var='state') **Degrees of freedom**: G - 1, where G is the number of clusters. **Clustering Level Selection**: Lee and Wooldridge (2025) provides guidance on choosing the clustering level. The clustering level should match the level at which the policy varies: - If the unit of observation is county but policy varies at the state level, cluster at the state level - If the unit of observation is individual but policy varies at the firm level, cluster at the firm level In general, cluster at the highest level where treatment assignment is determined. This ensures that standard errors properly account for the correlation structure induced by the policy variation. Randomization Inference ----------------------- Randomization inference provides an alternative to t-based inference without distributional assumptions. Basic Usage ~~~~~~~~~~~ .. code-block:: python results = lwdid( data, y='outcome', d='treated', ivar='unit', tvar='year', post='post', rolling='demean', ri=True, # Enable RI rireps=1000, # Number of randomization replications (default: 1000) seed=42 # Random seed for reproducibility ) print(f"RI p-value: {results.ri_pvalue:.4f}") RI Methods ~~~~~~~~~~ The package supports two RI methods: **Bootstrap** (default): .. code-block:: python results = lwdid(..., ri=True, ri_method='bootstrap') With-replacement resampling. **Permutation**: .. code-block:: python results = lwdid(..., ri=True, ri_method='permutation') Fisher randomization inference without replacement. Accessing RI Results ~~~~~~~~~~~~~~~~~~~~ .. code-block:: python results.ri_pvalue # RI p-value (available if ri=True) results.ri_method # RI method ('permutation' or 'bootstrap') results.rireps # Number of RI replications results.ri_valid # Number of valid RI replications results.ri_failed # Number of failed RI replications results.ri_seed # Random seed actually used for RI (auto-generated if not provided) Control Variables ----------------- Time-invariant control variables can be included in the estimation. Basic Usage ~~~~~~~~~~~ .. code-block:: python results = lwdid( data, y='outcome', d='treated', ivar='unit', tvar='year', post='post', rolling='demean', controls=['population', 'income', 'education'] ) Requirements ~~~~~~~~~~~~ - Controls must be time-invariant (constant within each unit) - Controls are automatically centered (demeaned) - Missing values in controls are handled at the estimation stage: observations with missing controls may be dropped if doing so still leaves enough treated and control units to satisfy the N1>K+1 and N0>K+1 conditions; otherwise controls are omitted and the full regression sample is retained (with a warning) Effect on Inference ~~~~~~~~~~~~~~~~~~~ Including controls: - Reduces residual variance (potentially increasing power) - Reduces degrees of freedom by the number of added regressors (controls and their interactions) - Controls are centered before inclusion in the cross-sectional regression - Controls are included only if both the treated and control groups satisfy N1>K+1 and N0>K+1 (where K is the number of controls); otherwise, controls are automatically excluded and a warning is issued Working with Results -------------------- The ``lwdid()`` function returns a ``LWDIDResults`` object with comprehensive estimation results and export capabilities. Core Estimates ~~~~~~~~~~~~~~ .. code-block:: python results.att # Average treatment effect on treated results.se_att # Standard error results.t_stat # t-statistic results.pvalue # Two-sided p-value results.ci_lower # 95% CI lower bound results.ci_upper # 95% CI upper bound results.df_inference # Degrees of freedom Period-Specific Effects ~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python # Access period-by-period estimates df_periods = results.att_by_period # DataFrame with columns: # - period: Time period label # - tindex: Numeric time index # - beta: Treatment effect estimate # - se: Standard error # - ci_lower, ci_upper: 95% confidence interval # - tstat: t-statistic # - pval: p-value # - N: Sample size # The first row summarizes the overall average effect ('period' == 'average'). Sample Information ~~~~~~~~~~~~~~~~~~ .. code-block:: python results.nobs # Total observations in regression results.n_treated # Number of treated units results.n_control # Number of control units results.K # Number of pre-treatment periods results.tpost1 # First post-treatment period index Visualization ~~~~~~~~~~~~~ .. code-block:: python # Plot treatment effects over time results.plot() # Customize the plot results.plot(graph_options={'figsize': (10, 6), 'title': 'Treatment Effects Over Time', 'ylabel': 'Effect Size'}) Export Results ~~~~~~~~~~~~~~ Excel: .. code-block:: python results.to_excel('results.xlsx') # Creates sheets: Summary, ByPeriod, and RI (if available) CSV: .. code-block:: python results.to_csv('results.csv') # Exports period-by-period estimates LaTeX: .. code-block:: python results.to_latex('table.tex') # Creates publication-ready LaTeX table Print Summary ~~~~~~~~~~~~~ .. code-block:: python results.summary() # Prints formatted summary to console Data Requirements and Validation --------------------------------- Panel Structure ~~~~~~~~~~~~~~~ **Required format**: - Long format: one row per (unit, time) observation - Unique (unit, time) combinations - Continuous time sequence (no gaps in time index) **Validation**: The package automatically validates: - No duplicate (unit, time) pairs - Time index forms continuous sequence - Sufficient pre-treatment periods for chosen transformation Treatment Structure ~~~~~~~~~~~~~~~~~~~ **Common timing assumption**: All treated units must start treatment in the same period. The ``post`` variable must be a function of time only. **Treatment persistence**: Once ``post`` switches from 0 to 1, it must remain 1 for all subsequent periods (no reversals). **Validation**: The package checks: - ``post`` is binary (0 or 1) - ``post`` is time-invariant across units in each period - No treatment reversals Treatment Indicator ~~~~~~~~~~~~~~~~~~~ The ``d`` parameter must be a time-invariant treatment group indicator: - ``d = 1`` for all periods if unit is in treatment group - ``d = 0`` for all periods if unit is in control group - ``d`` should NOT vary over time Common mistake: Passing time-varying treatment status instead of group indicator. Missing Data ~~~~~~~~~~~~ **Allowed**: - Missing values in outcome variable ``y`` (observations dropped) - Unbalanced panels (different units can have different numbers of periods) **Not allowed**: - Rows with missing values in ``d``, ``ivar``, ``tvar``, or ``post`` (these observations are dropped during validation) - Gaps in time sequence Handling Unbalanced Panels ~~~~~~~~~~~~~~~~~~~~~~~~~~ When the panel is unbalanced (different units observed in different periods), the ``balanced_panel`` parameter controls behavior: - ``balanced_panel='warn'`` (default): Issue a warning and proceed with estimation - ``balanced_panel='error'``: Raise an exception; require a balanced panel - ``balanced_panel='ignore'``: Silently proceed without any warning .. code-block:: python # Default: warn about unbalanced panel but proceed results = lwdid(data, y='outcome', ..., balanced_panel='warn') # Strict: require balanced panel results = lwdid(data, y='outcome', ..., balanced_panel='error') # Silent: ignore panel imbalance results = lwdid(data, y='outcome', ..., balanced_panel='ignore') **Selection mechanism assumption**: With unbalanced panels, selection into the sample may depend on time-invariant characteristics (removed by the transformation), but should not depend on outcome shocks. See :doc:`methodological_notes` for details. **Tip**: Use ``detrend`` instead of ``demean`` for additional robustness to selection bias in unbalanced panels, as detrending removes both level and trend heterogeneity. For diagnosing potential selection issues, use the selection diagnostics tools: .. code-block:: python from lwdid import diagnose_selection_mechanism diag = diagnose_selection_mechanism(data, ivar='unit', tvar='year') print(diag.summary()) See :doc:`api/selection_diagnostics` for detailed documentation. Never-Treated Unit Data Preparation ------------------------------------ In staggered adoption designs, never-treated units serve as the control group. Proper encoding of never-treated units is essential for correct estimation. Valid Never-Treated Encodings ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The ``gvar`` column can encode never-treated units using any of these values: 1. **Zero (0)**: Common in Stata and other software .. code-block:: python data['gvar'] = data['gvar'].fillna(0) # Convert NaN to 0 2. **Positive infinity (np.inf)**: Represents "treated at infinity" .. code-block:: python import numpy as np data.loc[data['never_treated'], 'gvar'] = np.inf 3. **NaN/NA/None**: Missing treatment time indicates never-treated .. code-block:: python # NaN is automatically recognized as never-treated data['gvar'] = data['first_treat_year'] # NaN for never-treated All three encodings are equivalent and produce identical results. Converting from Stata Data ~~~~~~~~~~~~~~~~~~~~~~~~~~ When importing data from Stata, never-treated units may be encoded differently: .. code-block:: python import pandas as pd import numpy as np # Load Stata data data = pd.read_stata('staggered_data.dta') # Common Stata encodings for never-treated: # - Missing values (.) → automatically become NaN in pandas # - Zero (0) → recognized as never-treated # - Large values (9999) → need manual conversion # Convert custom encoding to standard format data['gvar'] = data['gvar'].replace({9999: 0, -1: 0}) # Verify never-treated identification from lwdid.validation import is_never_treated unit_gvar = data.groupby('unit')['gvar'].first() n_nt = unit_gvar.apply(is_never_treated).sum() print(f"Never-treated units: {n_nt}") Checking Never-Treated Status ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Use the ``is_never_treated()`` function to verify encoding: .. code-block:: python from lwdid.validation import is_never_treated import numpy as np # Check individual values print(is_never_treated(0)) # True print(is_never_treated(np.inf)) # True print(is_never_treated(np.nan)) # True print(is_never_treated(2005)) # False # Check DataFrame unit_gvar = data.groupby('unit')['gvar'].first() nt_mask = unit_gvar.apply(is_never_treated) print(f"Total units: {len(unit_gvar)}") print(f"Never-treated: {nt_mask.sum()}") print(f"Treated: {(~nt_mask).sum()}") Common Issues and Solutions ~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Issue: No never-treated units found** .. code-block:: python # Check gvar values print(data['gvar'].unique()) # If using custom encoding (e.g., 9999 for never-treated) data['gvar'] = data['gvar'].replace({9999: 0}) **Issue: Negative gvar values** .. code-block:: python # Negative values are invalid if (data['gvar'] < 0).any(): print("Warning: Negative gvar values found") # Convert to valid encoding data.loc[data['gvar'] < 0, 'gvar'] = 0 **Issue: gvar varies within unit** .. code-block:: python # gvar must be time-invariant gvar_varies = data.groupby('unit')['gvar'].nunique() > 1 if gvar_varies.any(): print(f"Units with varying gvar: {gvar_varies.sum()}") # Use first observation's gvar first_gvar = data.groupby('unit')['gvar'].first() data['gvar'] = data['unit'].map(first_gvar) Best Practices -------------- Choosing Transformation Method ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1. **Start with ``demean``** for annual data 2. **Use ``detrend``** if pre-treatment trends differ across units 3. **Use ``demeanq``/``detrendq``** for quarterly data with seasonality 4. **Check pre-treatment period requirements** (:math:`T_0 \geq 1` for demean, :math:`T_0 \geq 2` for detrend) Choosing Variance Estimator ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1. **Default (``vce=None``)**: Appropriate when homoskedasticity is plausible 2. **``vce='hc3'``**: Recommended for small to moderate sample sizes when heteroskedasticity is suspected 3. **``vce='cluster'``**: Appropriate when errors are clustered 4. **Randomization inference**: Use as robustness check or when normality is doubtful Sample Size Considerations ~~~~~~~~~~~~~~~~~~~~~~~~~~~ - **Small samples (N < 50):** Use exact t-based inference (``vce=None``) under CLM assumptions (normality and homoskedasticity). Minimum requirement is :math:`N \geq 3` units. Having at least 10 units improves stability. - **Moderate to large samples (N >= 50):** Use heteroskedasticity-robust standard errors (``vce='hc3'``) for valid asymptotic inference. - **Large samples with controls:** Consider IPWRA (``estimator='ipwra'``) for doubly robust estimation when functional form assumptions are uncertain. - **Time dimension:** Method works best with large T (many time periods), where the central limit theorem across time supports normality of the transformed outcome. - **Asymptotic theory:** Lee and Wooldridge (2025) develops the theoretical foundation for large-sample inference, supporting HC0-HC4 and cluster-robust standard errors. Sample Size and Inference Method Selection Matrix ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The following matrix provides guidance on selecting the appropriate estimator and inference method based on sample size: **Small Samples (N < 50):** - **Recommended**: RA with ``vce=None`` (exact t-based inference) - **Alternative**: RA with ``ri=True`` (randomization inference) - **VCE options**: ``vce=None`` or ``vce='hc3'`` - **Inference distribution**: t-distribution - **Key assumption**: CLM assumptions (normality, homoskedasticity) for exact inference **Moderate Samples (50 ≤ N < 200):** - **Recommended**: RA or IPWRA with ``vce='hc3'`` - **VCE options**: ``vce='hc1'``, ``vce='hc3'``, or ``vce='cluster'`` - **Inference distribution**: t-distribution (RA) or normal (IPW/IPWRA/PSM) - **Key consideration**: HC3 provides better finite-sample performance **Large Samples (N ≥ 200):** - **Recommended**: IPWRA with ``vce='hc1'`` or ``vce='robust'`` - **Alternative**: RA, IPW, or PSM depending on substantive considerations - **VCE options**: All HC variants and cluster-robust - **Inference distribution**: t-distribution (RA) or normal (IPW/IPWRA/PSM) - **Key advantage**: Double robustness of IPWRA protects against misspecification **Inference Distribution by Estimator:** - **RA (Regression Adjustment)**: t-distribution with df = N - k - **IPW (Inverse Probability Weighting)**: Normal distribution (asymptotic) - **IPWRA (Doubly Robust)**: Normal distribution (asymptotic) - **PSM (Propensity Score Matching)**: Normal distribution (asymptotic) For detailed theoretical foundations, see :doc:`methodological_notes`. Estimator Selection (Large-Sample Common Timing) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ For large cross-sectional samples in common timing settings, multiple estimators are available beyond regression adjustment. Lee and Wooldridge (2025) shows that the rolling transformation approach enables application of various treatment effects estimators, including doubly robust methods. **Available estimators for common timing:** - ``'ra'`` (default): Regression adjustment via OLS. Consistent when the outcome model is correctly specified. Efficient under correct specification. Uses t-distribution for inference. - ``'ipw'``: Inverse probability weighting. Consistent when the propensity score model is correctly specified. Requires ``controls`` parameter. Uses normal distribution for asymptotic inference. - ``'ipwra'``: Doubly robust estimator combining regression and IPW. Consistent when either the outcome model or propensity score is correctly specified. Recommended when functional form assumptions are uncertain. Uses normal distribution for asymptotic inference. - ``'psm'``: Propensity score matching. Matches treated to control units based on propensity scores. Requires adequate sample sizes for matching. Uses normal distribution for asymptotic inference. **Example (IPWRA in common timing):** .. code-block:: python results = lwdid( data, y='outcome', d='treated', ivar='unit', tvar='year', post='post', rolling='demean', estimator='ipwra', # Doubly robust estimator controls=['x1', 'x2'], # Controls for outcome and PS models vce='hc3' # Robust standard errors ) **Estimator selection guidelines:** 1. **Small samples (N < 50)**: Use RA with ``vce=None`` for exact t-based inference under CLM assumptions 2. **Moderate samples (50 ≤ N < 200)**: Start with RA and ``vce='hc3'``; use IPWRA as robustness check 3. **Large samples (N ≥ 200)**: Use IPWRA as primary estimator when functional form assumptions are uncertain 4. **All sample sizes**: Use randomization inference (``ri=True``) as a robustness check that does not rely on distributional assumptions Diagnostic Checks ~~~~~~~~~~~~~~~~~ 1. Visual inspection: Plot outcome trends for treated vs. control units 2. Pre-treatment balance: Check if treated and control units are similar pre-treatment 3. Parallel trends: Examine pre-treatment period-specific effects (should be near zero) 4. Sensitivity: Try different transformation methods and variance estimators Troubleshooting --------------- Common Errors ~~~~~~~~~~~~~ **"Insufficient pre-treatment periods"**: - ``demean``/``demeanq`` require :math:`T_0 \geq 1` - ``detrend``/``detrendq`` require :math:`T_0 \geq 2` - Solution: Check your ``post`` variable or use a different transformation **"Treatment timing is not common across units"**: - ``post`` must be the same for all units in each period - Solution: Verify your ``post`` variable is time-based, not unit-specific **"Control variable varies within unit"**: - Controls must be time-invariant - Solution: Use only time-invariant controls or create unit-level averages **"Duplicate (unit, time) observations"**: - Each (unit, time) combination must appear only once - Solution: Check for data duplication or aggregation issues Performance Tips ~~~~~~~~~~~~~~~~ - For large ``rireps`` (e.g., 10,000), RI can be slow - Use ``seed`` parameter for reproducible RI results - Export results to Excel/CSV for further analysis in other tools Staggered Adoption Design ------------------------- When units are treated at different times, use the staggered adoption framework. .. note:: **Staggered transformation options**: Only ``demean`` and ``detrend`` are supported for staggered adoption designs through the ``lwdid()`` interface. Seasonal transformations (``demeanq``, ``detrendq``) are restricted to common timing mode; passing them with ``gvar`` raises a ``ValueError``. Basic Usage ~~~~~~~~~~~ Instead of ``post`` (common timing), specify ``gvar`` (first treatment period): .. code-block:: python results = lwdid( data, y='outcome', ivar='unit', tvar='year', gvar='first_treat_year', # First treatment period for each unit rolling='demean', aggregate='overall', # Aggregate to overall effect ) Staggered-Specific Parameters ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **gvar** : str Column name indicating the first treatment period for each unit. Units with ``gvar = 0``, ``gvar = inf``, or ``gvar = NaN`` are never treated. **aggregate** : {'none', 'cohort', 'overall'} - ``'none'``: Return (g,r)-specific effects only - ``'cohort'``: Average effects within each cohort (default) - ``'overall'``: Weighted average across cohorts **control_group** : {'not_yet_treated', 'never_treated', 'all_others'} - ``'not_yet_treated'``: Use never-treated plus not-yet-treated units (default) - ``'never_treated'``: Use only never-treated units - ``'all_others'``: All units not in the treated cohort (including already-treated units). Primarily for replication and diagnostics; may introduce forbidden comparisons. **estimator** : {'ra', 'ipw', 'ipwra', 'psm'} - ``'ra'``: Regression adjustment (default) - ``'ipw'``: Inverse probability weighting - ``'ipwra'``: Doubly robust (recommended) - ``'psm'``: Propensity score matching **exclude_pre_periods** : int, default 0 Number of periods immediately before treatment to exclude from pre-treatment calculations. Used for robustness checks when the no-anticipation assumption may be violated. For cohort g, the pre-treatment mean/trend is computed using periods {T_min, ..., g-1-exclude_pre_periods} instead of {T_min, ..., g-1}. See Lee and Wooldridge (2025) for methodological details. Robustness to No-Anticipation Violations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ When the no-anticipation assumption may be violated (e.g., units adjust behavior before formal treatment), the ``exclude_pre_periods`` parameter can be used to perform robustness checks. **Example**: .. code-block:: python # Baseline estimation result_baseline = lwdid( data, y='outcome', ivar='unit', tvar='year', gvar='cohort', rolling='demean' ) # Robustness check: exclude 1 period before treatment result_robust1 = lwdid( data, y='outcome', ivar='unit', tvar='year', gvar='cohort', rolling='demean', exclude_pre_periods=1 ) # Robustness check: exclude 2 periods before treatment result_robust2 = lwdid( data, y='outcome', ivar='unit', tvar='year', gvar='cohort', rolling='demean', exclude_pre_periods=2 ) # Compare results print(f"Baseline ATT: {result_baseline.att_overall:.4f}") print(f"Exclude 1 period: {result_robust1.att_overall:.4f}") print(f"Exclude 2 periods: {result_robust2.att_overall:.4f}") **Minimum pre-treatment period requirements**: - ``demean``: At least 1 pre-treatment period remaining after exclusion - ``detrend``: At least 2 pre-treatment periods remaining after exclusion If any cohort has insufficient pre-treatment periods after applying ``exclude_pre_periods``, an ``InsufficientPrePeriodsError`` is raised. Accessing Staggered Results ~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python # Overall weighted effect results.att_overall # ATT estimate results.se_overall # Standard error # Cohort-specific effects df_cohorts = results.att_by_cohort # Cohort-time specific effects df_gt = results.att_by_cohort_time # Event study visualization fig, ax = results.plot_event_study() Control Group Strategies ~~~~~~~~~~~~~~~~~~~~~~~~ **Not-yet-treated (default)**: Uses both never-treated units and units that will be treated in future periods. More efficient but requires stronger no-anticipation assumption. .. code-block:: python results = lwdid(..., control_group='not_yet_treated') **Never-treated only**: Uses only units that are never treated during the observation period. More robust but potentially less efficient. .. code-block:: python results = lwdid(..., control_group='never_treated') Estimator Selection ~~~~~~~~~~~~~~~~~~~ **Regression Adjustment (RA)**: - Consistent when conditional mean is correctly specified - Default estimator, efficient under correct specification **Doubly Robust (IPWRA)**: - Consistent if either outcome model or propensity score is correct - Recommended when functional form assumptions are uncertain .. code-block:: python results = lwdid(..., estimator='ipwra') **Inverse Probability Weighting (IPW)**: - Weights by propensity score - Consistent when propensity score is correctly specified **Propensity Score Matching (PSM)**: - Matches treated to control units by propensity score - Use when sample sizes are adequate for matching Staggered Example ~~~~~~~~~~~~~~~~~ .. code-block:: python import pandas as pd from lwdid import lwdid # Load castle law data data = pd.read_csv('castle.csv') # Estimate with staggered design results = lwdid( data, y='lhomicide', ivar='state', tvar='year', gvar='effyear', rolling='demean', aggregate='overall', control_group='not_yet_treated', estimator='ipwra', ) # Display results print(results.summary()) # Event study plot fig, ax = results.plot_event_study( include_pre_treatment=True, show_ci=True ) Pre-treatment Dynamics and Parallel Trends Testing -------------------------------------------------- Pre-treatment dynamics analysis enables assessment of the parallel trends assumption validity by estimating treatment effects in pre-treatment periods. Under the null hypothesis of parallel trends, these pre-treatment effects should be zero. This feature implements the methodology from Lee and Wooldridge (2025), using rolling transformations that look forward to future pre-treatment periods rather than backward. Why Pre-treatment Dynamics Matter ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The parallel trends assumption is fundamental to difference-in-differences identification. While this assumption cannot be directly tested (it concerns counterfactual outcomes), examining pre-treatment dynamics provides indirect evidence: - **Under parallel trends**: Pre-treatment ATT estimates should be approximately zero (within sampling variation) - **Violation of parallel trends**: Systematic non-zero pre-treatment effects suggest the control group may not provide a valid counterfactual The anchor point convention sets the effect at event time e = -1 (the period immediately before treatment) to exactly zero, providing a reference point for interpreting other pre-treatment effects. Basic Usage ~~~~~~~~~~~ Enable pre-treatment dynamics by setting ``include_pretreatment=True``: .. code-block:: python from lwdid import lwdid results = lwdid( data, y='outcome', ivar='unit', tvar='year', gvar='first_treat', rolling='demean', aggregate='cohort', include_pretreatment=True, # Enable pre-treatment estimation pretreatment_test=True, # Run parallel trends test (default) pretreatment_alpha=0.05, # Significance level (default) ) # View complete summary including pre-treatment results print(results.summary()) Accessing Pre-treatment Results ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python # Pre-treatment ATT estimates as DataFrame pre_df = results.att_pre_treatment print(pre_df) # Columns: cohort, period, event_time, att, se, ci_lower, ci_upper, # t_stat, pvalue, n_treated, n_control, is_anchor, rolling_window_size # Parallel trends test results pt = results.parallel_trends_test print(f"Joint F-statistic: {pt.joint_f_stat:.4f}") print(f"Joint p-value: {pt.joint_pvalue:.4f}") print(f"Reject parallel trends: {pt.reject_null}") print(f"Number of pre-treatment periods tested: {pt.n_periods}") Interpreting the Parallel Trends Test ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The parallel trends test performs a joint F-test of the null hypothesis that all pre-treatment ATT estimates (excluding the anchor point) are jointly zero: - **H0**: All pre-treatment ATT = 0 (parallel trends holds) - **H1**: At least one pre-treatment ATT ≠ 0 (parallel trends violated) **Interpretation guidelines**: - ``reject_null=False``: No statistical evidence against parallel trends. This does not prove parallel trends holds, but the data are consistent with it. - ``reject_null=True``: Evidence of pre-treatment differences. Consider: - Using ``detrend`` instead of ``demean`` to allow for heterogeneous trends - Examining which cohorts or periods show violations - Investigating potential anticipation effects or data issues Event Study Visualization with Pre-treatment ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The event study plot can display both pre-treatment and post-treatment effects: .. code-block:: python # Plot with pre-treatment effects fig, ax = results.plot_event_study( include_pre_treatment=True, # Show pre-treatment effects pre_treatment_color='gray', # Color for pre-treatment points post_treatment_color='blue', # Color for post-treatment points show_anchor_line=True, # Vertical line at e=-1 title='Event Study with Pre-treatment Dynamics', ylabel='Treatment Effect', ) fig.savefig('event_study_pretreatment.png', dpi=300) The plot shows: - Pre-treatment effects (e < -1) in gray with confidence intervals - Anchor point (e = -1) at zero - Post-treatment effects (e ≥ 0) in blue with confidence intervals - Optional vertical dashed line at the anchor point Complete Workflow Example ~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python import pandas as pd from lwdid import lwdid # Load data data = pd.read_csv('castle.csv') # Step 1: Estimate with pre-treatment dynamics results = lwdid( data, y='lhomicide', ivar='state', tvar='year', gvar='effyear', rolling='demean', aggregate='cohort', include_pretreatment=True, pretreatment_test=True, ) # Step 2: Check parallel trends test pt = results.parallel_trends_test if pt.reject_null: print("Warning: Evidence against parallel trends!") print(f"F-stat: {pt.joint_f_stat:.3f}, p-value: {pt.joint_pvalue:.4f}") print("Consider using rolling='detrend' to allow heterogeneous trends") else: print("No evidence against parallel trends") print(f"F-stat: {pt.joint_f_stat:.3f}, p-value: {pt.joint_pvalue:.4f}") # Step 3: Examine pre-treatment effects pre_df = results.att_pre_treatment non_anchor = pre_df[~pre_df['is_anchor']] print("\nPre-treatment effects (excluding anchor):") print(non_anchor[['event_time', 'att', 'se', 'pvalue']].to_string(index=False)) # Step 4: Visualize fig, ax = results.plot_event_study( include_pre_treatment=True, title='Castle Doctrine: Event Study with Pre-treatment', ) fig.savefig('castle_event_study.png', dpi=300, bbox_inches='tight') # Step 5: Report main results print(f"\nOverall ATT: {results.att_overall:.4f} (SE: {results.se_overall:.4f})") Transformation Methods for Pre-treatment ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Pre-treatment dynamics support all four transformation methods (``demean``, ``detrend``, ``demeanq``, ``detrendq``): **Demean (default)**: Uses rolling demeaning with future pre-treatment periods. For period t < g: .. math:: \dot{Y}_{itg} = Y_{it} - \frac{1}{g-1-t} \sum_{q=t+1}^{g-1} Y_{iq} **Detrend**: Uses rolling OLS detrending with future pre-treatment periods. Fits a linear trend using periods {t+1, ..., g-1} and computes the residual. The ``detrend`` option is appropriate when cohorts are suspected to have different pre-treatment trends. Trend Diagnostics Tools ----------------------- For exploratory analysis before running estimation, ``lwdid`` provides standalone diagnostic functions in the ``trend_diagnostics`` module. These tools help assess whether parallel trends holds and recommend an appropriate transformation method. Testing Parallel Trends with Standalone Function ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python import lwdid # Test parallel trends using placebo method pt_result = lwdid.test_parallel_trends( data=df, y='outcome', ivar='unit_id', tvar='year', gvar='first_treat', method='placebo', # or 'regression', 'visual' alpha=0.05 ) # Display summary print(pt_result.summary()) # Access individual results if pt_result.reject_null: print(f"Parallel trends rejected (p={pt_result.pvalue:.4f})") print(f"Recommendation: {pt_result.recommendation}") else: print("No evidence against parallel trends") Diagnosing Heterogeneous Trends ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python # Diagnose trend heterogeneity across cohorts ht_diag = lwdid.diagnose_heterogeneous_trends( data=df, y='outcome', ivar='unit_id', tvar='year', gvar='first_treat' ) # Display summary print(ht_diag.summary()) # Check for heterogeneous trends if ht_diag.has_heterogeneous_trends: print("Heterogeneous trends detected") print(f"Recommendation: {ht_diag.recommendation}") Getting Automated Transformation Recommendation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python # Get data-driven recommendation rec = lwdid.recommend_transformation( data=df, y='outcome', ivar='unit_id', tvar='year', gvar='first_treat', verbose=True ) print(f"Recommended method: {rec.recommended_method}") print(f"Confidence: {rec.confidence:.1%}") for reason in rec.reasons: print(f" - {reason}") # Use the recommendation results = lwdid.lwdid( data=df, y='outcome', ivar='unit_id', tvar='year', gvar='first_treat', rolling=rec.recommended_method, ) Visualizing Cohort Trends ~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python # Plot outcome trajectories by cohort fig = lwdid.plot_cohort_trends( data=df, y='outcome', ivar='unit_id', tvar='year', gvar='first_treat', normalize=True # Normalize to first period ) fig.savefig('cohort_trends.png', dpi=300) See :doc:`api/trend_diagnostics` for complete API documentation. Further Reading --------------- - :doc:`quickstart` - Quick start tutorial - :doc:`methodological_notes` - Theoretical background - :doc:`api/index` - Complete API reference - :doc:`api/staggered` - Staggered module API - :doc:`examples/index` - Usage examples