User Guide
This guide covers all aspects of using the lwdid package for difference-in-differences
estimation with rolling transformations.
Overview
The lwdid package implements the Lee and Wooldridge methods for difference-in-
differences estimation with panel data, covering three main scenarios:
Small-Sample Common Timing (Lee and Wooldridge 2026):
Exact t-based inference under CLM assumptions (normality and homoskedasticity)
Designed for settings with small numbers of treated or control units
Works best with large time dimensions
Large-Sample Common Timing (Lee and Wooldridge 2025):
Asymptotic inference with robust standard errors
Supports heteroskedasticity-robust (HC0-HC4) and cluster-robust options
Multiple estimators: RA, IPW, IPWRA, PSM
Staggered Adoption (Lee and Wooldridge 2025):
Units treated at different times
Cohort-time specific effect estimation
Flexible control group strategies and aggregation options
Key features:
Four transformation methods: demean, detrend, demeanq, detrendq
Multiple variance estimators for different assumptions
Randomization inference for small samples
Event study visualization for staggered designs
Scenario Selection Guide
Use this guide to choose the appropriate scenario for your analysis:
Small-sample common timing: Use when N is small with common treatment timing. Key parameters:
post,vce=None.Large-sample common timing: Use when N is larger with common treatment timing. Key parameters:
post,vce='hc3'.Staggered adoption: Use when units are treated at different times. Key parameters:
gvar,aggregate.
Decision Flowchart:
┌─────────────────────────────────┐
│ Treatment timing varies │
│ across units? │
└───────────────┬─────────────────┘
│
┌───────────────┴───────────────┐
│ │
▼ ▼
┌────────┐ ┌─────────┐
│ YES │ │ NO │
└────┬───┘ └────┬────┘
│ │
▼ ▼
┌───────────────────────────┐ ┌─────────────────────────────┐
│ STAGGERED ADOPTION │ │ Is cross-sectional N │
│ Use: gvar parameter │ │ small (< 50)? │
│ Estimator: ra/ipwra │ └───────────────┬─────────────┘
│ Aggregate: cohort/overall│ │
└───────────────────────────┘ ┌───────────────┴───────────────┐
│ │
▼ ▼
┌────────┐ ┌─────────┐
│ YES │ │ NO │
└────┬───┘ └────┬────┘
│ │
▼ ▼
┌─────────────────────────────────────┐ ┌─────────────────────────────────┐
│ SMALL-SAMPLE COMMON TIMING │ │ LARGE-SAMPLE COMMON TIMING │
│ Use: post parameter │ │ Use: post parameter │
│ VCE: None (exact t-inference) │ │ VCE: hc3 or robust │
│ Alt: ri=True for RI │ │ Estimator: ra/ipwra │
│ Reference: Lee and Wooldridge │ │ Reference: Lee and Wooldridge │
│ (2026) │ │ (2025) │
└─────────────────────────────────────┘ └─────────────────────────────────┘
Quick Reference
For a quick start, see Quick Start. For complete API details, see API Reference.
Transformation Methods
Four transformation methods convert panel data into cross-sectional form. The choice depends on data structure and assumptions.
demean: Standard DiD with Unit Fixed Effects
When to use:
Annual data or data without seasonal patterns
Control for time-invariant unit characteristics
At least 1 pre-treatment period (\(T_0 \geq 1\))
What it does:
Removes unit-specific pre-treatment means from each observation.
Mathematical form (Lee and Wooldridge 2026):
For each unit \(i\), compute the pre-treatment mean \(\bar{Y}_{i,pre}\) and transform:
Then estimate a cross-sectional regression on the transformed data.
Example:
results = lwdid(
data, y='outcome', d='treated', ivar='unit', tvar='year',
post='post', rolling='demean'
)
detrend: DiD with Unit-Specific Linear Trends
When to use:
Units exhibit different linear trends in the pre-treatment period
Control for both unit fixed effects and unit-specific trends
At least 2 pre-treatment periods (\(T_0 \geq 2\))
What it does:
Estimates and removes unit-specific linear trends from the pre-treatment data.
Mathematical form (Lee and Wooldridge 2026):
For each unit \(i\), estimate linear trend from pre-treatment data:
Then detrend all observations using the estimated trend.
Example:
results = lwdid(
data, y='outcome', d='treated', ivar='unit', tvar='year',
post='post', rolling='detrend'
)
demeanq: Seasonal Data with Seasonal Fixed Effects
When to use:
Periodic data with seasonal patterns (quarterly, monthly, or weekly)
Control for both unit and season-of-year effects
Pre-treatment sample rich enough to estimate seasonal fixed effects for each unit: for each unit, the number of pre-treatment observations must be at least Q + 1 (where Q is the number of seasons per year)
For each unit, every season that appears in the post-treatment period must also appear in its pre-treatment period (season-coverage condition).
What it does:
Removes unit-specific pre-treatment means and season-of-year effects.
Supported seasonal periods (Q parameter):
Q=4 (default): Quarterly data (4 seasons per year)
Q=12: Monthly data (12 seasons per year)
Q=52: Weekly data (52 seasons per year)
Data requirements:
For quarterly data: Pass
tvaras a list:tvar=['year', 'quarter']For monthly/weekly data: Use a single time variable with
Qandseason_varparameters
Examples:
Quarterly data (Q=4, default):
results = lwdid(
data_q, y='sales', d='treated', ivar='store',
tvar=['year', 'quarter'], # Composite time variable
post='post', rolling='demeanq'
)
Monthly data (Q=12):
results = lwdid(
data_m, y='sales', d='treated', ivar='store',
tvar='time', # Single time index
post='post', rolling='demeanq',
Q=12, # 12 seasons per year
season_var='month' # Month indicator (1-12)
)
Weekly data (Q=52):
results = lwdid(
data_w, y='sales', d='treated', ivar='store',
tvar='time', # Single time index
post='post', rolling='demeanq',
Q=52, # 52 seasons per year
season_var='week' # Week indicator (1-52)
)
detrendq: Seasonal Data with Trends and Seasonal Effects
When to use:
Periodic data with both trends and seasonal patterns
Control for unit trends and season-of-year effects
At least Q + 2 pre-treatment observations per unit (where Q is the number of seasons per year)
For each unit, every season that appears in the post-treatment period must also appear in its pre-treatment period (season-coverage condition).
What it does:
Combines detrending with seasonal adjustment.
Supported seasonal periods (Q parameter):
Q=4 (default): Quarterly data (4 seasons per year)
Q=12: Monthly data (12 seasons per year)
Q=52: Weekly data (52 seasons per year)
Examples:
Quarterly data (Q=4, default):
results = lwdid(
data_q, y='sales', d='treated', ivar='store',
tvar=['year', 'quarter'],
post='post', rolling='detrendq'
)
Monthly data (Q=12):
results = lwdid(
data_m, y='sales', d='treated', ivar='store',
tvar='time',
post='post', rolling='detrendq',
Q=12,
season_var='month'
)
Weekly data (Q=52):
results = lwdid(
data_w, y='sales', d='treated', ivar='store',
tvar='time',
post='post', rolling='detrendq',
Q=52,
season_var='week'
)
Automatic Frequency Detection
For convenience, lwdid can automatically detect the data frequency and set
the appropriate Q parameter:
results = lwdid(
data, y='sales', d='treated', ivar='store',
tvar='time',
post='post', rolling='demeanq',
auto_detect_frequency=True # Automatically detect Q
)
When auto_detect_frequency=True, the package examines the time variable to
determine the most likely data frequency:
Detects monthly data (Q=12) based on 12 observations per year pattern
Detects weekly data (Q=52) based on 52 observations per year pattern
Falls back to quarterly (Q=4) if frequency cannot be determined
Note: If both an explicit Q (other than the default 4) and
auto_detect_frequency=True are specified, and the detected frequency
differs from the explicit Q, a warning is issued and the explicit Q
value takes precedence. No error is raised.
Variance Estimation
The package supports multiple variance estimators for different assumptions about the error structure.
OLS (Homoskedastic)
When to use:
Errors are homoskedastic and normally distributed
Exact t-based inference is desired
Specification:
results = lwdid(..., vce=None) # Default
Degrees of freedom:
Non-clustered: df equals the residual degrees of freedom from the cross-sectional regression (df_resid). In a simple specification without controls this is N - 2 (intercept and treatment indicator); with controls and interactions it is N minus the total number of estimated parameters.
Clustered: df = G - 1 (G clusters)
HC0 (White’s Heteroskedasticity-Consistent)
When to use:
Heteroskedasticity is suspected
Sample size is large
Specification:
results = lwdid(..., vce='hc0')
Note: HC0 is the original heteroskedasticity-consistent estimator without finite-sample adjustments.
HC1 (Heteroskedasticity-Robust)
When to use:
Heteroskedasticity is suspected
Sample size is moderate to large, so asymptotic approximations are more reliable
Specification:
results = lwdid(..., vce='hc1')
# or equivalently:
results = lwdid(..., vce='robust')
Note: 'hc1' and 'robust' are aliases for the same estimator. HC1 applies
a degrees-of-freedom correction (n/(n-k)) to HC0.
HC2 (Leverage-Adjusted)
When to use:
Heteroskedasticity is suspected
Sample size is small to moderate
Observations have varying leverage
Specification:
results = lwdid(..., vce='hc2')
Note: HC2 divides squared residuals by (1 - h_ii) where h_ii is the diagonal of the hat matrix. Provides better performance than HC1 when leverage varies across observations.
HC3 (Small-Sample Adjusted Robust)
When to use:
Heteroskedasticity is suspected
Sample size is small or moderate
Specification:
results = lwdid(..., vce='hc3')
Advantage: HC3 divides squared residuals by (1 - h_ii)^2, providing better finite-sample performance than HC1 and HC2 in small samples.
HC4 (High-Leverage Adjusted)
When to use:
Heteroskedasticity is suspected
Data contains high-leverage observations
Specification:
results = lwdid(..., vce='hc4')
Note: HC4 uses an adaptive exponent that increases the adjustment for observations with high leverage. Recommended when the design matrix contains potentially influential observations.
Cluster-Robust
When to use:
Errors are correlated within clusters (e.g., states, regions)
Multiple units per cluster
Specification:
results = lwdid(..., vce='cluster', cluster_var='state')
Degrees of freedom: G - 1, where G is the number of clusters.
Clustering Level Selection: Lee and Wooldridge (2025) provides guidance on choosing the clustering level. The clustering level should match the level at which the policy varies:
If the unit of observation is county but policy varies at the state level, cluster at the state level
If the unit of observation is individual but policy varies at the firm level, cluster at the firm level
In general, cluster at the highest level where treatment assignment is determined. This ensures that standard errors properly account for the correlation structure induced by the policy variation.
Randomization Inference
Randomization inference provides an alternative to t-based inference without distributional assumptions.
Basic Usage
results = lwdid(
data, y='outcome', d='treated', ivar='unit', tvar='year',
post='post', rolling='demean',
ri=True, # Enable RI
rireps=1000, # Number of randomization replications (default: 1000)
seed=42 # Random seed for reproducibility
)
print(f"RI p-value: {results.ri_pvalue:.4f}")
RI Methods
The package supports two RI methods:
Bootstrap (default):
results = lwdid(..., ri=True, ri_method='bootstrap')
With-replacement resampling.
Permutation:
results = lwdid(..., ri=True, ri_method='permutation')
Fisher randomization inference without replacement.
Accessing RI Results
results.ri_pvalue # RI p-value (available if ri=True)
results.ri_method # RI method ('permutation' or 'bootstrap')
results.rireps # Number of RI replications
results.ri_valid # Number of valid RI replications
results.ri_failed # Number of failed RI replications
results.ri_seed # Random seed actually used for RI (auto-generated if not provided)
Control Variables
Time-invariant control variables can be included in the estimation.
Basic Usage
results = lwdid(
data, y='outcome', d='treated', ivar='unit', tvar='year',
post='post', rolling='demean',
controls=['population', 'income', 'education']
)
Requirements
Controls must be time-invariant (constant within each unit)
Controls are automatically centered (demeaned)
Missing values in controls are handled at the estimation stage: observations with missing controls may be dropped if doing so still leaves enough treated and control units to satisfy the N1>K+1 and N0>K+1 conditions; otherwise controls are omitted and the full regression sample is retained (with a warning)
Effect on Inference
Including controls:
Reduces residual variance (potentially increasing power)
Reduces degrees of freedom by the number of added regressors (controls and their interactions)
Controls are centered before inclusion in the cross-sectional regression
Controls are included only if both the treated and control groups satisfy N1>K+1 and N0>K+1 (where K is the number of controls); otherwise, controls are automatically excluded and a warning is issued
Working with Results
The lwdid() function returns a LWDIDResults object with comprehensive
estimation results and export capabilities.
Core Estimates
results.att # Average treatment effect on treated
results.se_att # Standard error
results.t_stat # t-statistic
results.pvalue # Two-sided p-value
results.ci_lower # 95% CI lower bound
results.ci_upper # 95% CI upper bound
results.df_inference # Degrees of freedom
Period-Specific Effects
# Access period-by-period estimates
df_periods = results.att_by_period
# DataFrame with columns:
# - period: Time period label
# - tindex: Numeric time index
# - beta: Treatment effect estimate
# - se: Standard error
# - ci_lower, ci_upper: 95% confidence interval
# - tstat: t-statistic
# - pval: p-value
# - N: Sample size
# The first row summarizes the overall average effect ('period' == 'average').
Sample Information
results.nobs # Total observations in regression
results.n_treated # Number of treated units
results.n_control # Number of control units
results.K # Number of pre-treatment periods
results.tpost1 # First post-treatment period index
Visualization
# Plot treatment effects over time
results.plot()
# Customize the plot
results.plot(graph_options={'figsize': (10, 6), 'title': 'Treatment Effects Over Time', 'ylabel': 'Effect Size'})
Export Results
Excel:
results.to_excel('results.xlsx')
# Creates sheets: Summary, ByPeriod, and RI (if available)
CSV:
results.to_csv('results.csv')
# Exports period-by-period estimates
LaTeX:
results.to_latex('table.tex')
# Creates publication-ready LaTeX table
Print Summary
results.summary() # Prints formatted summary to console
Data Requirements and Validation
Panel Structure
Required format:
Long format: one row per (unit, time) observation
Unique (unit, time) combinations
Continuous time sequence (no gaps in time index)
Validation:
The package automatically validates:
No duplicate (unit, time) pairs
Time index forms continuous sequence
Sufficient pre-treatment periods for chosen transformation
Treatment Structure
Common timing assumption:
All treated units must start treatment in the same period. The post
variable must be a function of time only.
Treatment persistence:
Once post switches from 0 to 1, it must remain 1 for all subsequent
periods (no reversals).
Validation:
The package checks:
postis binary (0 or 1)postis time-invariant across units in each periodNo treatment reversals
Treatment Indicator
The d parameter must be a time-invariant treatment group indicator:
d = 1for all periods if unit is in treatment groupd = 0for all periods if unit is in control groupdshould NOT vary over time
Common mistake: Passing time-varying treatment status instead of group indicator.
Missing Data
Allowed:
Missing values in outcome variable
y(observations dropped)Unbalanced panels (different units can have different numbers of periods)
Not allowed:
Rows with missing values in
d,ivar,tvar, orpost(these observations are dropped during validation)Gaps in time sequence
Handling Unbalanced Panels
When the panel is unbalanced (different units observed in different periods), the
balanced_panel parameter controls behavior:
balanced_panel='warn'(default): Issue a warning and proceed with estimationbalanced_panel='error': Raise an exception; require a balanced panelbalanced_panel='ignore': Silently proceed without any warning
# Default: warn about unbalanced panel but proceed
results = lwdid(data, y='outcome', ..., balanced_panel='warn')
# Strict: require balanced panel
results = lwdid(data, y='outcome', ..., balanced_panel='error')
# Silent: ignore panel imbalance
results = lwdid(data, y='outcome', ..., balanced_panel='ignore')
Selection mechanism assumption: With unbalanced panels, selection into the sample may depend on time-invariant characteristics (removed by the transformation), but should not depend on outcome shocks. See Methodological Notes for details.
Tip: Use detrend instead of demean for additional robustness to selection
bias in unbalanced panels, as detrending removes both level and trend heterogeneity.
For diagnosing potential selection issues, use the selection diagnostics tools:
from lwdid import diagnose_selection_mechanism
diag = diagnose_selection_mechanism(data, ivar='unit', tvar='year')
print(diag.summary())
See Selection Diagnostics Module (selection_diagnostics) for detailed documentation.
Never-Treated Unit Data Preparation
In staggered adoption designs, never-treated units serve as the control group. Proper encoding of never-treated units is essential for correct estimation.
Valid Never-Treated Encodings
The gvar column can encode never-treated units using any of these values:
Zero (0): Common in Stata and other software
data['gvar'] = data['gvar'].fillna(0) # Convert NaN to 0
Positive infinity (np.inf): Represents “treated at infinity”
import numpy as np data.loc[data['never_treated'], 'gvar'] = np.inf
NaN/NA/None: Missing treatment time indicates never-treated
# NaN is automatically recognized as never-treated data['gvar'] = data['first_treat_year'] # NaN for never-treated
All three encodings are equivalent and produce identical results.
Converting from Stata Data
When importing data from Stata, never-treated units may be encoded differently:
import pandas as pd
import numpy as np
# Load Stata data
data = pd.read_stata('staggered_data.dta')
# Common Stata encodings for never-treated:
# - Missing values (.) → automatically become NaN in pandas
# - Zero (0) → recognized as never-treated
# - Large values (9999) → need manual conversion
# Convert custom encoding to standard format
data['gvar'] = data['gvar'].replace({9999: 0, -1: 0})
# Verify never-treated identification
from lwdid.validation import is_never_treated
unit_gvar = data.groupby('unit')['gvar'].first()
n_nt = unit_gvar.apply(is_never_treated).sum()
print(f"Never-treated units: {n_nt}")
Checking Never-Treated Status
Use the is_never_treated() function to verify encoding:
from lwdid.validation import is_never_treated
import numpy as np
# Check individual values
print(is_never_treated(0)) # True
print(is_never_treated(np.inf)) # True
print(is_never_treated(np.nan)) # True
print(is_never_treated(2005)) # False
# Check DataFrame
unit_gvar = data.groupby('unit')['gvar'].first()
nt_mask = unit_gvar.apply(is_never_treated)
print(f"Total units: {len(unit_gvar)}")
print(f"Never-treated: {nt_mask.sum()}")
print(f"Treated: {(~nt_mask).sum()}")
Common Issues and Solutions
Issue: No never-treated units found
# Check gvar values
print(data['gvar'].unique())
# If using custom encoding (e.g., 9999 for never-treated)
data['gvar'] = data['gvar'].replace({9999: 0})
Issue: Negative gvar values
# Negative values are invalid
if (data['gvar'] < 0).any():
print("Warning: Negative gvar values found")
# Convert to valid encoding
data.loc[data['gvar'] < 0, 'gvar'] = 0
Issue: gvar varies within unit
# gvar must be time-invariant
gvar_varies = data.groupby('unit')['gvar'].nunique() > 1
if gvar_varies.any():
print(f"Units with varying gvar: {gvar_varies.sum()}")
# Use first observation's gvar
first_gvar = data.groupby('unit')['gvar'].first()
data['gvar'] = data['unit'].map(first_gvar)
Best Practices
Choosing Transformation Method
Start with ``demean`` for annual data
Use ``detrend`` if pre-treatment trends differ across units
Use ``demeanq``/``detrendq`` for quarterly data with seasonality
Check pre-treatment period requirements (\(T_0 \geq 1\) for demean, \(T_0 \geq 2\) for detrend)
Choosing Variance Estimator
Default (``vce=None``): Appropriate when homoskedasticity is plausible
``vce=’hc3’``: Recommended for small to moderate sample sizes when heteroskedasticity is suspected
``vce=’cluster’``: Appropriate when errors are clustered
Randomization inference: Use as robustness check or when normality is doubtful
Sample Size Considerations
Small samples (N < 50): Use exact t-based inference (
vce=None) under CLM assumptions (normality and homoskedasticity). Minimum requirement is \(N \geq 3\) units. Having at least 10 units improves stability.Moderate to large samples (N >= 50): Use heteroskedasticity-robust standard errors (
vce='hc3') for valid asymptotic inference.Large samples with controls: Consider IPWRA (
estimator='ipwra') for doubly robust estimation when functional form assumptions are uncertain.Time dimension: Method works best with large T (many time periods), where the central limit theorem across time supports normality of the transformed outcome.
Asymptotic theory: Lee and Wooldridge (2025) develops the theoretical foundation for large-sample inference, supporting HC0-HC4 and cluster-robust standard errors.
Sample Size and Inference Method Selection Matrix
The following matrix provides guidance on selecting the appropriate estimator and inference method based on sample size:
Small Samples (N < 50):
Recommended: RA with
vce=None(exact t-based inference)Alternative: RA with
ri=True(randomization inference)VCE options:
vce=Noneorvce='hc3'Inference distribution: t-distribution
Key assumption: CLM assumptions (normality, homoskedasticity) for exact inference
Moderate Samples (50 ≤ N < 200):
Recommended: RA or IPWRA with
vce='hc3'VCE options:
vce='hc1',vce='hc3', orvce='cluster'Inference distribution: t-distribution (RA) or normal (IPW/IPWRA/PSM)
Key consideration: HC3 provides better finite-sample performance
Large Samples (N ≥ 200):
Recommended: IPWRA with
vce='hc1'orvce='robust'Alternative: RA, IPW, or PSM depending on substantive considerations
VCE options: All HC variants and cluster-robust
Inference distribution: t-distribution (RA) or normal (IPW/IPWRA/PSM)
Key advantage: Double robustness of IPWRA protects against misspecification
Inference Distribution by Estimator:
RA (Regression Adjustment): t-distribution with df = N - k
IPW (Inverse Probability Weighting): Normal distribution (asymptotic)
IPWRA (Doubly Robust): Normal distribution (asymptotic)
PSM (Propensity Score Matching): Normal distribution (asymptotic)
For detailed theoretical foundations, see Methodological Notes.
Estimator Selection (Large-Sample Common Timing)
For large cross-sectional samples in common timing settings, multiple estimators are available beyond regression adjustment. Lee and Wooldridge (2025) shows that the rolling transformation approach enables application of various treatment effects estimators, including doubly robust methods.
Available estimators for common timing:
'ra'(default): Regression adjustment via OLS. Consistent when the outcome model is correctly specified. Efficient under correct specification. Uses t-distribution for inference.'ipw': Inverse probability weighting. Consistent when the propensity score model is correctly specified. Requirescontrolsparameter. Uses normal distribution for asymptotic inference.'ipwra': Doubly robust estimator combining regression and IPW. Consistent when either the outcome model or propensity score is correctly specified. Recommended when functional form assumptions are uncertain. Uses normal distribution for asymptotic inference.'psm': Propensity score matching. Matches treated to control units based on propensity scores. Requires adequate sample sizes for matching. Uses normal distribution for asymptotic inference.
Example (IPWRA in common timing):
results = lwdid(
data, y='outcome', d='treated', ivar='unit', tvar='year',
post='post', rolling='demean',
estimator='ipwra', # Doubly robust estimator
controls=['x1', 'x2'], # Controls for outcome and PS models
vce='hc3' # Robust standard errors
)
Estimator selection guidelines:
Small samples (N < 50): Use RA with
vce=Nonefor exact t-based inference under CLM assumptionsModerate samples (50 ≤ N < 200): Start with RA and
vce='hc3'; use IPWRA as robustness checkLarge samples (N ≥ 200): Use IPWRA as primary estimator when functional form assumptions are uncertain
All sample sizes: Use randomization inference (
ri=True) as a robustness check that does not rely on distributional assumptions
Diagnostic Checks
Visual inspection: Plot outcome trends for treated vs. control units
Pre-treatment balance: Check if treated and control units are similar pre-treatment
Parallel trends: Examine pre-treatment period-specific effects (should be near zero)
Sensitivity: Try different transformation methods and variance estimators
Troubleshooting
Common Errors
“Insufficient pre-treatment periods”:
demean/demeanqrequire \(T_0 \geq 1\)detrend/detrendqrequire \(T_0 \geq 2\)Solution: Check your
postvariable or use a different transformation
“Treatment timing is not common across units”:
postmust be the same for all units in each periodSolution: Verify your
postvariable is time-based, not unit-specific
“Control variable varies within unit”:
Controls must be time-invariant
Solution: Use only time-invariant controls or create unit-level averages
“Duplicate (unit, time) observations”:
Each (unit, time) combination must appear only once
Solution: Check for data duplication or aggregation issues
Performance Tips
For large
rireps(e.g., 10,000), RI can be slowUse
seedparameter for reproducible RI resultsExport results to Excel/CSV for further analysis in other tools
Staggered Adoption Design
When units are treated at different times, use the staggered adoption framework.
Note
Staggered transformation options: Only demean and detrend
are supported for staggered adoption designs through the lwdid()
interface. Seasonal transformations (demeanq, detrendq) are
restricted to common timing mode; passing them with gvar raises
a ValueError.
Basic Usage
Instead of post (common timing), specify gvar (first treatment period):
results = lwdid(
data,
y='outcome',
ivar='unit',
tvar='year',
gvar='first_treat_year', # First treatment period for each unit
rolling='demean',
aggregate='overall', # Aggregate to overall effect
)
Staggered-Specific Parameters
- gvarstr
Column name indicating the first treatment period for each unit. Units with
gvar = 0,gvar = inf, orgvar = NaNare never treated.- aggregate{‘none’, ‘cohort’, ‘overall’}
'none': Return (g,r)-specific effects only'cohort': Average effects within each cohort (default)'overall': Weighted average across cohorts
- control_group{‘not_yet_treated’, ‘never_treated’, ‘all_others’}
'not_yet_treated': Use never-treated plus not-yet-treated units (default)'never_treated': Use only never-treated units'all_others': All units not in the treated cohort (including already-treated units). Primarily for replication and diagnostics; may introduce forbidden comparisons.
- estimator{‘ra’, ‘ipw’, ‘ipwra’, ‘psm’}
'ra': Regression adjustment (default)'ipw': Inverse probability weighting'ipwra': Doubly robust (recommended)'psm': Propensity score matching
- exclude_pre_periodsint, default 0
Number of periods immediately before treatment to exclude from pre-treatment calculations. Used for robustness checks when the no-anticipation assumption may be violated.
For cohort g, the pre-treatment mean/trend is computed using periods {T_min, …, g-1-exclude_pre_periods} instead of {T_min, …, g-1}.
See Lee and Wooldridge (2025) for methodological details.
Robustness to No-Anticipation Violations
When the no-anticipation assumption may be violated (e.g., units adjust behavior
before formal treatment), the exclude_pre_periods parameter can be used to
perform robustness checks.
Example:
# Baseline estimation
result_baseline = lwdid(
data, y='outcome', ivar='unit', tvar='year',
gvar='cohort', rolling='demean'
)
# Robustness check: exclude 1 period before treatment
result_robust1 = lwdid(
data, y='outcome', ivar='unit', tvar='year',
gvar='cohort', rolling='demean',
exclude_pre_periods=1
)
# Robustness check: exclude 2 periods before treatment
result_robust2 = lwdid(
data, y='outcome', ivar='unit', tvar='year',
gvar='cohort', rolling='demean',
exclude_pre_periods=2
)
# Compare results
print(f"Baseline ATT: {result_baseline.att_overall:.4f}")
print(f"Exclude 1 period: {result_robust1.att_overall:.4f}")
print(f"Exclude 2 periods: {result_robust2.att_overall:.4f}")
Minimum pre-treatment period requirements:
demean: At least 1 pre-treatment period remaining after exclusiondetrend: At least 2 pre-treatment periods remaining after exclusion
If any cohort has insufficient pre-treatment periods after applying
exclude_pre_periods, an InsufficientPrePeriodsError is raised.
Accessing Staggered Results
# Overall weighted effect
results.att_overall # ATT estimate
results.se_overall # Standard error
# Cohort-specific effects
df_cohorts = results.att_by_cohort
# Cohort-time specific effects
df_gt = results.att_by_cohort_time
# Event study visualization
fig, ax = results.plot_event_study()
Control Group Strategies
Not-yet-treated (default):
Uses both never-treated units and units that will be treated in future periods. More efficient but requires stronger no-anticipation assumption.
results = lwdid(..., control_group='not_yet_treated')
Never-treated only:
Uses only units that are never treated during the observation period. More robust but potentially less efficient.
results = lwdid(..., control_group='never_treated')
Estimator Selection
Regression Adjustment (RA):
Consistent when conditional mean is correctly specified
Default estimator, efficient under correct specification
Doubly Robust (IPWRA):
Consistent if either outcome model or propensity score is correct
Recommended when functional form assumptions are uncertain
results = lwdid(..., estimator='ipwra')
Inverse Probability Weighting (IPW):
Weights by propensity score
Consistent when propensity score is correctly specified
Propensity Score Matching (PSM):
Matches treated to control units by propensity score
Use when sample sizes are adequate for matching
Staggered Example
import pandas as pd
from lwdid import lwdid
# Load castle law data
data = pd.read_csv('castle.csv')
# Estimate with staggered design
results = lwdid(
data,
y='lhomicide',
ivar='state',
tvar='year',
gvar='effyear',
rolling='demean',
aggregate='overall',
control_group='not_yet_treated',
estimator='ipwra',
)
# Display results
print(results.summary())
# Event study plot
fig, ax = results.plot_event_study(
include_pre_treatment=True,
show_ci=True
)
Pre-treatment Dynamics and Parallel Trends Testing
Pre-treatment dynamics analysis enables assessment of the parallel trends assumption validity by estimating treatment effects in pre-treatment periods. Under the null hypothesis of parallel trends, these pre-treatment effects should be zero.
This feature implements the methodology from Lee and Wooldridge (2025), using rolling transformations that look forward to future pre-treatment periods rather than backward.
Why Pre-treatment Dynamics Matter
The parallel trends assumption is fundamental to difference-in-differences identification. While this assumption cannot be directly tested (it concerns counterfactual outcomes), examining pre-treatment dynamics provides indirect evidence:
Under parallel trends: Pre-treatment ATT estimates should be approximately zero (within sampling variation)
Violation of parallel trends: Systematic non-zero pre-treatment effects suggest the control group may not provide a valid counterfactual
The anchor point convention sets the effect at event time e = -1 (the period immediately before treatment) to exactly zero, providing a reference point for interpreting other pre-treatment effects.
Basic Usage
Enable pre-treatment dynamics by setting include_pretreatment=True:
from lwdid import lwdid
results = lwdid(
data,
y='outcome',
ivar='unit',
tvar='year',
gvar='first_treat',
rolling='demean',
aggregate='cohort',
include_pretreatment=True, # Enable pre-treatment estimation
pretreatment_test=True, # Run parallel trends test (default)
pretreatment_alpha=0.05, # Significance level (default)
)
# View complete summary including pre-treatment results
print(results.summary())
Accessing Pre-treatment Results
# Pre-treatment ATT estimates as DataFrame
pre_df = results.att_pre_treatment
print(pre_df)
# Columns: cohort, period, event_time, att, se, ci_lower, ci_upper,
# t_stat, pvalue, n_treated, n_control, is_anchor, rolling_window_size
# Parallel trends test results
pt = results.parallel_trends_test
print(f"Joint F-statistic: {pt.joint_f_stat:.4f}")
print(f"Joint p-value: {pt.joint_pvalue:.4f}")
print(f"Reject parallel trends: {pt.reject_null}")
print(f"Number of pre-treatment periods tested: {pt.n_periods}")
Interpreting the Parallel Trends Test
The parallel trends test performs a joint F-test of the null hypothesis that all pre-treatment ATT estimates (excluding the anchor point) are jointly zero:
H0: All pre-treatment ATT = 0 (parallel trends holds)
H1: At least one pre-treatment ATT ≠ 0 (parallel trends violated)
Interpretation guidelines:
reject_null=False: No statistical evidence against parallel trends. This does not prove parallel trends holds, but the data are consistent with it.reject_null=True: Evidence of pre-treatment differences. Consider:Using
detrendinstead ofdemeanto allow for heterogeneous trendsExamining which cohorts or periods show violations
Investigating potential anticipation effects or data issues
Event Study Visualization with Pre-treatment
The event study plot can display both pre-treatment and post-treatment effects:
# Plot with pre-treatment effects
fig, ax = results.plot_event_study(
include_pre_treatment=True, # Show pre-treatment effects
pre_treatment_color='gray', # Color for pre-treatment points
post_treatment_color='blue', # Color for post-treatment points
show_anchor_line=True, # Vertical line at e=-1
title='Event Study with Pre-treatment Dynamics',
ylabel='Treatment Effect',
)
fig.savefig('event_study_pretreatment.png', dpi=300)
The plot shows:
Pre-treatment effects (e < -1) in gray with confidence intervals
Anchor point (e = -1) at zero
Post-treatment effects (e ≥ 0) in blue with confidence intervals
Optional vertical dashed line at the anchor point
Complete Workflow Example
import pandas as pd
from lwdid import lwdid
# Load data
data = pd.read_csv('castle.csv')
# Step 1: Estimate with pre-treatment dynamics
results = lwdid(
data,
y='lhomicide',
ivar='state',
tvar='year',
gvar='effyear',
rolling='demean',
aggregate='cohort',
include_pretreatment=True,
pretreatment_test=True,
)
# Step 2: Check parallel trends test
pt = results.parallel_trends_test
if pt.reject_null:
print("Warning: Evidence against parallel trends!")
print(f"F-stat: {pt.joint_f_stat:.3f}, p-value: {pt.joint_pvalue:.4f}")
print("Consider using rolling='detrend' to allow heterogeneous trends")
else:
print("No evidence against parallel trends")
print(f"F-stat: {pt.joint_f_stat:.3f}, p-value: {pt.joint_pvalue:.4f}")
# Step 3: Examine pre-treatment effects
pre_df = results.att_pre_treatment
non_anchor = pre_df[~pre_df['is_anchor']]
print("\nPre-treatment effects (excluding anchor):")
print(non_anchor[['event_time', 'att', 'se', 'pvalue']].to_string(index=False))
# Step 4: Visualize
fig, ax = results.plot_event_study(
include_pre_treatment=True,
title='Castle Doctrine: Event Study with Pre-treatment',
)
fig.savefig('castle_event_study.png', dpi=300, bbox_inches='tight')
# Step 5: Report main results
print(f"\nOverall ATT: {results.att_overall:.4f} (SE: {results.se_overall:.4f})")
Transformation Methods for Pre-treatment
Pre-treatment dynamics support all four transformation methods (demean,
detrend, demeanq, detrendq):
Demean (default):
Uses rolling demeaning with future pre-treatment periods. For period t < g:
Detrend:
Uses rolling OLS detrending with future pre-treatment periods. Fits a linear trend using periods {t+1, …, g-1} and computes the residual.
The detrend option is appropriate when cohorts are suspected to have different pre-treatment trends.
Trend Diagnostics Tools
For exploratory analysis before running estimation, lwdid provides standalone
diagnostic functions in the trend_diagnostics module. These tools help assess
whether parallel trends holds and recommend an appropriate transformation method.
Testing Parallel Trends with Standalone Function
import lwdid
# Test parallel trends using placebo method
pt_result = lwdid.test_parallel_trends(
data=df,
y='outcome',
ivar='unit_id',
tvar='year',
gvar='first_treat',
method='placebo', # or 'regression', 'visual'
alpha=0.05
)
# Display summary
print(pt_result.summary())
# Access individual results
if pt_result.reject_null:
print(f"Parallel trends rejected (p={pt_result.pvalue:.4f})")
print(f"Recommendation: {pt_result.recommendation}")
else:
print("No evidence against parallel trends")
Diagnosing Heterogeneous Trends
# Diagnose trend heterogeneity across cohorts
ht_diag = lwdid.diagnose_heterogeneous_trends(
data=df,
y='outcome',
ivar='unit_id',
tvar='year',
gvar='first_treat'
)
# Display summary
print(ht_diag.summary())
# Check for heterogeneous trends
if ht_diag.has_heterogeneous_trends:
print("Heterogeneous trends detected")
print(f"Recommendation: {ht_diag.recommendation}")
Getting Automated Transformation Recommendation
# Get data-driven recommendation
rec = lwdid.recommend_transformation(
data=df,
y='outcome',
ivar='unit_id',
tvar='year',
gvar='first_treat',
verbose=True
)
print(f"Recommended method: {rec.recommended_method}")
print(f"Confidence: {rec.confidence:.1%}")
for reason in rec.reasons:
print(f" - {reason}")
# Use the recommendation
results = lwdid.lwdid(
data=df,
y='outcome',
ivar='unit_id',
tvar='year',
gvar='first_treat',
rolling=rec.recommended_method,
)
Visualizing Cohort Trends
# Plot outcome trajectories by cohort
fig = lwdid.plot_cohort_trends(
data=df,
y='outcome',
ivar='unit_id',
tvar='year',
gvar='first_treat',
normalize=True # Normalize to first period
)
fig.savefig('cohort_trends.png', dpi=300)
See Trend Diagnostics Module (trend_diagnostics) for complete API documentation.
Further Reading
Quick Start - Quick start tutorial
Methodological Notes - Theoretical background
API Reference - Complete API reference
Staggered DiD Module (staggered) - Staggered module API
Examples - Usage examples