User Guide

This guide covers all aspects of using the lwdid package for difference-in-differences estimation with rolling transformations.

Overview 

The lwdid package implements the Lee and Wooldridge methods for difference-in- differences estimation with panel data, covering three main scenarios:

Small-Sample Common Timing (Lee and Wooldridge 2026):

Exact t-based inference under CLM assumptions (normality and homoskedasticity)
Designed for settings with small numbers of treated or control units
Works best with large time dimensions

Large-Sample Common Timing (Lee and Wooldridge 2025):

Asymptotic inference with robust standard errors
Supports heteroskedasticity-robust (HC0-HC4) and cluster-robust options
Multiple estimators: RA, IPW, IPWRA, PSM

Staggered Adoption (Lee and Wooldridge 2025):

Units treated at different times
Cohort-time specific effect estimation
Flexible control group strategies and aggregation options

Key features:

Four transformation methods: demean, detrend, demeanq, detrendq
Multiple variance estimators for different assumptions
Randomization inference for small samples
Event study visualization for staggered designs

Scenario Selection Guide 

Use this guide to choose the appropriate scenario for your analysis:

Small-sample common timing: Use when N is small with common treatment timing. Key parameters: post, vce=None.
Large-sample common timing: Use when N is larger with common treatment timing. Key parameters: post, vce='hc3'.
Staggered adoption: Use when units are treated at different times. Key parameters: gvar, aggregate.

Decision Flowchart:

                ┌─────────────────────────────────┐
                │  Treatment timing varies        │
                │  across units?                  │
                └───────────────┬─────────────────┘
                                │
                ┌───────────────┴───────────────┐
                │                               │
                ▼                               ▼
           ┌────────┐                     ┌─────────┐
           │  YES   │                     │   NO    │
           └────┬───┘                     └────┬────┘
                │                              │
                ▼                              ▼
┌───────────────────────────┐   ┌─────────────────────────────┐
│  STAGGERED ADOPTION       │   │  Is cross-sectional N       │
│  Use: gvar parameter      │   │  small (< 50)?              │
│  Estimator: ra/ipwra      │   └───────────────┬─────────────┘
│  Aggregate: cohort/overall│                   │
└───────────────────────────┘   ┌───────────────┴───────────────┐
                                │                               │
                                ▼                               ▼
                           ┌────────┐                     ┌─────────┐
                           │  YES   │                     │   NO    │
                           └────┬───┘                     └────┬────┘
                                │                              │
                                ▼                              ▼
        ┌─────────────────────────────────────┐   ┌─────────────────────────────────┐
        │  SMALL-SAMPLE COMMON TIMING         │   │  LARGE-SAMPLE COMMON TIMING     │
        │  Use: post parameter                │   │  Use: post parameter            │
        │  VCE: None (exact t-inference)      │   │  VCE: hc3 or robust             │
        │  Alt: ri=True for RI                │   │  Estimator: ra/ipwra            │
        │  Reference: Lee and Wooldridge      │   │  Reference: Lee and Wooldridge  │
        │  (2026)                             │   │  (2025)                         │
        └─────────────────────────────────────┘   └─────────────────────────────────┘

Quick Reference 

For a quick start, see Quick Start. For complete API details, see API Reference.

Transformation Methods 

Four transformation methods convert panel data into cross-sectional form. The choice depends on data structure and assumptions.

demean: Standard DiD with Unit Fixed Effects 

When to use:

Annual data or data without seasonal patterns
Control for time-invariant unit characteristics
At least 1 pre-treatment period (\(T_0 \geq 1\))

What it does:

Removes unit-specific pre-treatment means from each observation.

Mathematical form (Lee and Wooldridge 2026):

For each unit \(i\), compute the pre-treatment mean \(\bar{Y}_{i,pre}\) and transform:

\[\dot{Y}_{it} = Y_{it} - \bar{Y}_{i,pre} \quad \text{for all } t\]

Then estimate a cross-sectional regression on the transformed data.

Example:

results = lwdid(
    data, y='outcome', d='treated', ivar='unit', tvar='year',
    post='post', rolling='demean'
)

detrend: DiD with Unit-Specific Linear Trends 

When to use:

Units exhibit different linear trends in the pre-treatment period
Control for both unit fixed effects and unit-specific trends
At least 2 pre-treatment periods (\(T_0 \geq 2\))

What it does:

Estimates and removes unit-specific linear trends from the pre-treatment data.

Mathematical form (Lee and Wooldridge 2026):

For each unit \(i\), estimate linear trend from pre-treatment data:

\[Y_{it} = \alpha_i + \beta_i t + \varepsilon_{it} \quad \text{(for } t \text{ in pre-treatment period)}\]

Then detrend all observations using the estimated trend.

Example:

results = lwdid(
    data, y='outcome', d='treated', ivar='unit', tvar='year',
    post='post', rolling='detrend'
)

demeanq: Seasonal Data with Seasonal Fixed Effects 

When to use:

Periodic data with seasonal patterns (quarterly, monthly, or weekly)
Control for both unit and season-of-year effects
Pre-treatment sample rich enough to estimate seasonal fixed effects for each unit: for each unit, the number of pre-treatment observations must be at least Q + 1 (where Q is the number of seasons per year)
For each unit, every season that appears in the post-treatment period must also appear in its pre-treatment period (season-coverage condition).

What it does:

Removes unit-specific pre-treatment means and season-of-year effects.

Supported seasonal periods (Q parameter):

Q=4 (default): Quarterly data (4 seasons per year)
Q=12: Monthly data (12 seasons per year)
Q=52: Weekly data (52 seasons per year)

Data requirements:

For quarterly data: Pass tvar as a list: tvar=['year', 'quarter']
For monthly/weekly data: Use a single time variable with Q and season_var parameters

Examples:

Quarterly data (Q=4, default):

results = lwdid(
    data_q, y='sales', d='treated', ivar='store',
    tvar=['year', 'quarter'],  # Composite time variable
    post='post', rolling='demeanq'
)

Monthly data (Q=12):

results = lwdid(
    data_m, y='sales', d='treated', ivar='store',
    tvar='time',              # Single time index
    post='post', rolling='demeanq',
    Q=12,                     # 12 seasons per year
    season_var='month'        # Month indicator (1-12)
)

Weekly data (Q=52):

results = lwdid(
    data_w, y='sales', d='treated', ivar='store',
    tvar='time',              # Single time index
    post='post', rolling='demeanq',
    Q=52,                     # 52 seasons per year
    season_var='week'         # Week indicator (1-52)
)

detrendq: Seasonal Data with Trends and Seasonal Effects 

When to use:

Periodic data with both trends and seasonal patterns
Control for unit trends and season-of-year effects
At least Q + 2 pre-treatment observations per unit (where Q is the number of seasons per year)
For each unit, every season that appears in the post-treatment period must also appear in its pre-treatment period (season-coverage condition).

What it does:

Combines detrending with seasonal adjustment.

Supported seasonal periods (Q parameter):

Q=4 (default): Quarterly data (4 seasons per year)
Q=12: Monthly data (12 seasons per year)
Q=52: Weekly data (52 seasons per year)

Examples:

Quarterly data (Q=4, default):

results = lwdid(
    data_q, y='sales', d='treated', ivar='store',
    tvar=['year', 'quarter'],
    post='post', rolling='detrendq'
)

Monthly data (Q=12):

results = lwdid(
    data_m, y='sales', d='treated', ivar='store',
    tvar='time',
    post='post', rolling='detrendq',
    Q=12,
    season_var='month'
)

Weekly data (Q=52):

results = lwdid(
    data_w, y='sales', d='treated', ivar='store',
    tvar='time',
    post='post', rolling='detrendq',
    Q=52,
    season_var='week'
)

Automatic Frequency Detection 

For convenience, lwdid can automatically detect the data frequency and set the appropriate Q parameter:

results = lwdid(
    data, y='sales', d='treated', ivar='store',
    tvar='time',
    post='post', rolling='demeanq',
    auto_detect_frequency=True    # Automatically detect Q
)

When auto_detect_frequency=True, the package examines the time variable to determine the most likely data frequency:

Detects monthly data (Q=12) based on 12 observations per year pattern
Detects weekly data (Q=52) based on 52 observations per year pattern
Falls back to quarterly (Q=4) if frequency cannot be determined

Note: If both an explicit Q (other than the default 4) and auto_detect_frequency=True are specified, and the detected frequency differs from the explicit Q, a warning is issued and the explicit Q value takes precedence. No error is raised.

Variance Estimation 

The package supports multiple variance estimators for different assumptions about the error structure.

OLS (Homoskedastic)

When to use:

Errors are homoskedastic and normally distributed
Exact t-based inference is desired

Specification:

results = lwdid(..., vce=None)  # Default

Degrees of freedom:

Non-clustered: df equals the residual degrees of freedom from the cross-sectional regression (df_resid). In a simple specification without controls this is N - 2 (intercept and treatment indicator); with controls and interactions it is N minus the total number of estimated parameters.
Clustered: df = G - 1 (G clusters)

HC0 (White’s Heteroskedasticity-Consistent)

When to use:

Heteroskedasticity is suspected
Sample size is large

Specification:

results = lwdid(..., vce='hc0')

Note: HC0 is the original heteroskedasticity-consistent estimator without finite-sample adjustments.

HC1 (Heteroskedasticity-Robust)

When to use:

Heteroskedasticity is suspected
Sample size is moderate to large, so asymptotic approximations are more reliable

Specification:

results = lwdid(..., vce='hc1')
# or equivalently:
results = lwdid(..., vce='robust')

Note: 'hc1' and 'robust' are aliases for the same estimator. HC1 applies a degrees-of-freedom correction (n/(n-k)) to HC0.

HC2 (Leverage-Adjusted)

When to use:

Heteroskedasticity is suspected
Sample size is small to moderate
Observations have varying leverage

Specification:

results = lwdid(..., vce='hc2')

Note: HC2 divides squared residuals by (1 - h_ii) where h_ii is the diagonal of the hat matrix. Provides better performance than HC1 when leverage varies across observations.

HC3 (Small-Sample Adjusted Robust)

When to use:

Heteroskedasticity is suspected
Sample size is small or moderate

Specification:

results = lwdid(..., vce='hc3')

Advantage: HC3 divides squared residuals by (1 - h_ii)^2, providing better finite-sample performance than HC1 and HC2 in small samples.

HC4 (High-Leverage Adjusted)

When to use:

Heteroskedasticity is suspected
Data contains high-leverage observations

Specification:

results = lwdid(..., vce='hc4')

Note: HC4 uses an adaptive exponent that increases the adjustment for observations with high leverage. Recommended when the design matrix contains potentially influential observations.

Cluster-Robust 

When to use:

Errors are correlated within clusters (e.g., states, regions)
Multiple units per cluster

Specification:

results = lwdid(..., vce='cluster', cluster_var='state')

Degrees of freedom: G - 1, where G is the number of clusters.

Clustering Level Selection: Lee and Wooldridge (2025) provides guidance on choosing the clustering level. The clustering level should match the level at which the policy varies:

If the unit of observation is county but policy varies at the state level, cluster at the state level
If the unit of observation is individual but policy varies at the firm level, cluster at the firm level

In general, cluster at the highest level where treatment assignment is determined. This ensures that standard errors properly account for the correlation structure induced by the policy variation.

Randomization Inference 

Randomization inference provides an alternative to t-based inference without distributional assumptions.

Basic Usage 

results = lwdid(
    data, y='outcome', d='treated', ivar='unit', tvar='year',
    post='post', rolling='demean',
    ri=True,        # Enable RI
    rireps=1000,    # Number of randomization replications (default: 1000)
    seed=42         # Random seed for reproducibility
)

print(f"RI p-value: {results.ri_pvalue:.4f}")

RI Methods 

The package supports two RI methods:

Bootstrap (default):

results = lwdid(..., ri=True, ri_method='bootstrap')

With-replacement resampling.

Permutation:

results = lwdid(..., ri=True, ri_method='permutation')

Fisher randomization inference without replacement.

Accessing RI Results 

results.ri_pvalue          # RI p-value (available if ri=True)
results.ri_method          # RI method ('permutation' or 'bootstrap')
results.rireps             # Number of RI replications
results.ri_valid           # Number of valid RI replications
results.ri_failed          # Number of failed RI replications
results.ri_seed            # Random seed actually used for RI (auto-generated if not provided)

Control Variables 

Time-invariant control variables can be included in the estimation.

Basic Usage 

results = lwdid(
    data, y='outcome', d='treated', ivar='unit', tvar='year',
    post='post', rolling='demean',
    controls=['population', 'income', 'education']
)

Requirements 

Controls must be time-invariant (constant within each unit)
Controls are automatically centered (demeaned)
Missing values in controls are handled at the estimation stage: observations with missing controls may be dropped if doing so still leaves enough treated and control units to satisfy the N1>K+1 and N0>K+1 conditions; otherwise controls are omitted and the full regression sample is retained (with a warning)

Effect on Inference 

Including controls:

Reduces residual variance (potentially increasing power)
Reduces degrees of freedom by the number of added regressors (controls and their interactions)
Controls are centered before inclusion in the cross-sectional regression
Controls are included only if both the treated and control groups satisfy N1>K+1 and N0>K+1 (where K is the number of controls); otherwise, controls are automatically excluded and a warning is issued

Working with Results 

The lwdid() function returns a LWDIDResults object with comprehensive estimation results and export capabilities.

Core Estimates 

results.att           # Average treatment effect on treated
results.se_att        # Standard error
results.t_stat        # t-statistic
results.pvalue        # Two-sided p-value
results.ci_lower      # 95% CI lower bound
results.ci_upper      # 95% CI upper bound
results.df_inference  # Degrees of freedom

Period-Specific Effects 

# Access period-by-period estimates
df_periods = results.att_by_period

# DataFrame with columns:
# - period: Time period label
# - tindex: Numeric time index
# - beta: Treatment effect estimate
# - se: Standard error
# - ci_lower, ci_upper: 95% confidence interval
# - tstat: t-statistic
# - pval: p-value
# - N: Sample size
# The first row summarizes the overall average effect ('period' == 'average').

Sample Information 

results.nobs          # Total observations in regression
results.n_treated     # Number of treated units
results.n_control     # Number of control units
results.K             # Number of pre-treatment periods
results.tpost1        # First post-treatment period index

Visualization 

# Plot treatment effects over time
results.plot()

# Customize the plot
results.plot(graph_options={'figsize': (10, 6), 'title': 'Treatment Effects Over Time', 'ylabel': 'Effect Size'})

Export Results 

Excel:

results.to_excel('results.xlsx')
# Creates sheets: Summary, ByPeriod, and RI (if available)

CSV:

results.to_csv('results.csv')
# Exports period-by-period estimates

LaTeX:

results.to_latex('table.tex')
# Creates publication-ready LaTeX table

Print Summary 

results.summary()  # Prints formatted summary to console

Data Requirements and Validation 

Panel Structure 

Required format:

Long format: one row per (unit, time) observation
Unique (unit, time) combinations
Continuous time sequence (no gaps in time index)

Validation:

The package automatically validates:

No duplicate (unit, time) pairs
Time index forms continuous sequence
Sufficient pre-treatment periods for chosen transformation

Treatment Structure 

Common timing assumption:

All treated units must start treatment in the same period. The post variable must be a function of time only.

Treatment persistence:

Once post switches from 0 to 1, it must remain 1 for all subsequent periods (no reversals).

Validation:

The package checks:

post is binary (0 or 1)
post is time-invariant across units in each period
No treatment reversals

Treatment Indicator 

The d parameter must be a time-invariant treatment group indicator:

d = 1 for all periods if unit is in treatment group
d = 0 for all periods if unit is in control group
d should NOT vary over time

Common mistake: Passing time-varying treatment status instead of group indicator.

Missing Data 

Allowed:

Missing values in outcome variable y (observations dropped)
Unbalanced panels (different units can have different numbers of periods)

Not allowed:

Rows with missing values in d, ivar, tvar, or post (these observations are dropped during validation)
Gaps in time sequence

Handling Unbalanced Panels 

When the panel is unbalanced (different units observed in different periods), the balanced_panel parameter controls behavior:

balanced_panel='warn' (default): Issue a warning and proceed with estimation
balanced_panel='error': Raise an exception; require a balanced panel
balanced_panel='ignore': Silently proceed without any warning

# Default: warn about unbalanced panel but proceed
results = lwdid(data, y='outcome', ..., balanced_panel='warn')

# Strict: require balanced panel
results = lwdid(data, y='outcome', ..., balanced_panel='error')

# Silent: ignore panel imbalance
results = lwdid(data, y='outcome', ..., balanced_panel='ignore')

Selection mechanism assumption: With unbalanced panels, selection into the sample may depend on time-invariant characteristics (removed by the transformation), but should not depend on outcome shocks. See Methodological Notes for details.

Tip: Use detrend instead of demean for additional robustness to selection bias in unbalanced panels, as detrending removes both level and trend heterogeneity.

For diagnosing potential selection issues, use the selection diagnostics tools:

from lwdid import diagnose_selection_mechanism

diag = diagnose_selection_mechanism(data, ivar='unit', tvar='year')
print(diag.summary())

See Selection Diagnostics Module (selection_diagnostics) for detailed documentation.

Never-Treated Unit Data Preparation 

In staggered adoption designs, never-treated units serve as the control group. Proper encoding of never-treated units is essential for correct estimation.

Valid Never-Treated Encodings 

The gvar column can encode never-treated units using any of these values:

Zero (0): Common in Stata and other software

data['gvar'] = data['gvar'].fillna(0)  # Convert NaN to 0

Positive infinity (np.inf): Represents “treated at infinity”

import numpy as np
data.loc[data['never_treated'], 'gvar'] = np.inf

NaN/NA/None: Missing treatment time indicates never-treated

# NaN is automatically recognized as never-treated
data['gvar'] = data['first_treat_year']  # NaN for never-treated

All three encodings are equivalent and produce identical results.

Converting from Stata Data 

When importing data from Stata, never-treated units may be encoded differently:

import pandas as pd
import numpy as np

# Load Stata data
data = pd.read_stata('staggered_data.dta')

# Common Stata encodings for never-treated:
# - Missing values (.) → automatically become NaN in pandas
# - Zero (0) → recognized as never-treated
# - Large values (9999) → need manual conversion

# Convert custom encoding to standard format
data['gvar'] = data['gvar'].replace({9999: 0, -1: 0})

# Verify never-treated identification
from lwdid.validation import is_never_treated
unit_gvar = data.groupby('unit')['gvar'].first()
n_nt = unit_gvar.apply(is_never_treated).sum()
print(f"Never-treated units: {n_nt}")

Checking Never-Treated Status 

Use the is_never_treated() function to verify encoding:

from lwdid.validation import is_never_treated
import numpy as np

# Check individual values
print(is_never_treated(0))        # True
print(is_never_treated(np.inf))   # True
print(is_never_treated(np.nan))   # True
print(is_never_treated(2005))     # False

# Check DataFrame
unit_gvar = data.groupby('unit')['gvar'].first()
nt_mask = unit_gvar.apply(is_never_treated)

print(f"Total units: {len(unit_gvar)}")
print(f"Never-treated: {nt_mask.sum()}")
print(f"Treated: {(~nt_mask).sum()}")

Common Issues and Solutions 

Issue: No never-treated units found

# Check gvar values
print(data['gvar'].unique())

# If using custom encoding (e.g., 9999 for never-treated)
data['gvar'] = data['gvar'].replace({9999: 0})

Issue: Negative gvar values

# Negative values are invalid
if (data['gvar'] < 0).any():
    print("Warning: Negative gvar values found")
    # Convert to valid encoding
    data.loc[data['gvar'] < 0, 'gvar'] = 0

Issue: gvar varies within unit

# gvar must be time-invariant
gvar_varies = data.groupby('unit')['gvar'].nunique() > 1
if gvar_varies.any():
    print(f"Units with varying gvar: {gvar_varies.sum()}")
    # Use first observation's gvar
    first_gvar = data.groupby('unit')['gvar'].first()
    data['gvar'] = data['unit'].map(first_gvar)

Best Practices 

Choosing Transformation Method 

Start with ``demean`` for annual data
Use ``detrend`` if pre-treatment trends differ across units
Use ``demeanq``/``detrendq`` for quarterly data with seasonality
Check pre-treatment period requirements (\(T_0 \geq 1\) for demean, \(T_0 \geq 2\) for detrend)

Choosing Variance Estimator 

Default (``vce=None``): Appropriate when homoskedasticity is plausible
``vce=’hc3’``: Recommended for small to moderate sample sizes when heteroskedasticity is suspected
``vce=’cluster’``: Appropriate when errors are clustered
Randomization inference: Use as robustness check or when normality is doubtful

Sample Size Considerations 

Small samples (N < 50): Use exact t-based inference (vce=None) under CLM assumptions (normality and homoskedasticity). Minimum requirement is \(N \geq 3\) units. Having at least 10 units improves stability.
Moderate to large samples (N >= 50): Use heteroskedasticity-robust standard errors (vce='hc3') for valid asymptotic inference.
Large samples with controls: Consider IPWRA (estimator='ipwra') for doubly robust estimation when functional form assumptions are uncertain.
Time dimension: Method works best with large T (many time periods), where the central limit theorem across time supports normality of the transformed outcome.
Asymptotic theory: Lee and Wooldridge (2025) develops the theoretical foundation for large-sample inference, supporting HC0-HC4 and cluster-robust standard errors.

Sample Size and Inference Method Selection Matrix 

The following matrix provides guidance on selecting the appropriate estimator and inference method based on sample size:

Small Samples (N < 50):

Recommended: RA with vce=None (exact t-based inference)
Alternative: RA with ri=True (randomization inference)
VCE options: vce=None or vce='hc3'
Inference distribution: t-distribution
Key assumption: CLM assumptions (normality, homoskedasticity) for exact inference

Moderate Samples (50 ≤ N < 200):

Recommended: RA or IPWRA with vce='hc3'
VCE options: vce='hc1', vce='hc3', or vce='cluster'
Inference distribution: t-distribution (RA) or normal (IPW/IPWRA/PSM)
Key consideration: HC3 provides better finite-sample performance

Large Samples (N ≥ 200):

Recommended: IPWRA with vce='hc1' or vce='robust'
Alternative: RA, IPW, or PSM depending on substantive considerations
VCE options: All HC variants and cluster-robust
Inference distribution: t-distribution (RA) or normal (IPW/IPWRA/PSM)
Key advantage: Double robustness of IPWRA protects against misspecification

Inference Distribution by Estimator:

RA (Regression Adjustment): t-distribution with df = N - k
IPW (Inverse Probability Weighting): Normal distribution (asymptotic)
IPWRA (Doubly Robust): Normal distribution (asymptotic)
PSM (Propensity Score Matching): Normal distribution (asymptotic)

For detailed theoretical foundations, see Methodological Notes.

Estimator Selection (Large-Sample Common Timing)

For large cross-sectional samples in common timing settings, multiple estimators are available beyond regression adjustment. Lee and Wooldridge (2025) shows that the rolling transformation approach enables application of various treatment effects estimators, including doubly robust methods.

Available estimators for common timing:

'ra' (default): Regression adjustment via OLS. Consistent when the outcome model is correctly specified. Efficient under correct specification. Uses t-distribution for inference.
'ipw': Inverse probability weighting. Consistent when the propensity score model is correctly specified. Requires controls parameter. Uses normal distribution for asymptotic inference.
'ipwra': Doubly robust estimator combining regression and IPW. Consistent when either the outcome model or propensity score is correctly specified. Recommended when functional form assumptions are uncertain. Uses normal distribution for asymptotic inference.
'psm': Propensity score matching. Matches treated to control units based on propensity scores. Requires adequate sample sizes for matching. Uses normal distribution for asymptotic inference.

Example (IPWRA in common timing):

results = lwdid(
    data, y='outcome', d='treated', ivar='unit', tvar='year',
    post='post', rolling='demean',
    estimator='ipwra',           # Doubly robust estimator
    controls=['x1', 'x2'],       # Controls for outcome and PS models
    vce='hc3'                    # Robust standard errors
)

Estimator selection guidelines:

Small samples (N < 50): Use RA with vce=None for exact t-based inference under CLM assumptions
Moderate samples (50 ≤ N < 200): Start with RA and vce='hc3'; use IPWRA as robustness check
Large samples (N ≥ 200): Use IPWRA as primary estimator when functional form assumptions are uncertain
All sample sizes: Use randomization inference (ri=True) as a robustness check that does not rely on distributional assumptions

Diagnostic Checks 

Visual inspection: Plot outcome trends for treated vs. control units
Pre-treatment balance: Check if treated and control units are similar pre-treatment
Parallel trends: Examine pre-treatment period-specific effects (should be near zero)
Sensitivity: Try different transformation methods and variance estimators

Troubleshooting 

Common Errors 

“Insufficient pre-treatment periods”:

demean/demeanq require \(T_0 \geq 1\)
detrend/detrendq require \(T_0 \geq 2\)
Solution: Check your post variable or use a different transformation

“Treatment timing is not common across units”:

post must be the same for all units in each period
Solution: Verify your post variable is time-based, not unit-specific

“Control variable varies within unit”:

Controls must be time-invariant
Solution: Use only time-invariant controls or create unit-level averages

“Duplicate (unit, time) observations”:

Each (unit, time) combination must appear only once
Solution: Check for data duplication or aggregation issues

Performance Tips 

For large rireps (e.g., 10,000), RI can be slow
Use seed parameter for reproducible RI results
Export results to Excel/CSV for further analysis in other tools

Staggered Adoption Design 

When units are treated at different times, use the staggered adoption framework.

Note

Staggered transformation options: Only demean and detrend are supported for staggered adoption designs through the lwdid() interface. Seasonal transformations (demeanq, detrendq) are restricted to common timing mode; passing them with gvar raises a ValueError.

Basic Usage 

Instead of post (common timing), specify gvar (first treatment period):

results = lwdid(
    data,
    y='outcome',
    ivar='unit',
    tvar='year',
    gvar='first_treat_year',  # First treatment period for each unit
    rolling='demean',
    aggregate='overall',      # Aggregate to overall effect
)

Staggered-Specific Parameters 

gvarstr

Column name indicating the first treatment period for each unit. Units with gvar = 0, gvar = inf, or gvar = NaN are never treated.

aggregate{‘none’, ‘cohort’, ‘overall’}

'none': Return (g,r)-specific effects only
'cohort': Average effects within each cohort (default)
'overall': Weighted average across cohorts

control_group{‘not_yet_treated’, ‘never_treated’, ‘all_others’}

'not_yet_treated': Use never-treated plus not-yet-treated units (default)
'never_treated': Use only never-treated units
'all_others': All units not in the treated cohort (including already-treated units). Primarily for replication and diagnostics; may introduce forbidden comparisons.

estimator{‘ra’, ‘ipw’, ‘ipwra’, ‘psm’}

'ra': Regression adjustment (default)
'ipw': Inverse probability weighting
'ipwra': Doubly robust (recommended)
'psm': Propensity score matching

exclude_pre_periodsint, default 0

Number of periods immediately before treatment to exclude from pre-treatment calculations. Used for robustness checks when the no-anticipation assumption may be violated.

For cohort g, the pre-treatment mean/trend is computed using periods {T_min, …, g-1-exclude_pre_periods} instead of {T_min, …, g-1}.

See Lee and Wooldridge (2025) for methodological details.

Robustness to No-Anticipation Violations 

When the no-anticipation assumption may be violated (e.g., units adjust behavior before formal treatment), the exclude_pre_periods parameter can be used to perform robustness checks.

Example:

# Baseline estimation
result_baseline = lwdid(
    data, y='outcome', ivar='unit', tvar='year',
    gvar='cohort', rolling='demean'
)

# Robustness check: exclude 1 period before treatment
result_robust1 = lwdid(
    data, y='outcome', ivar='unit', tvar='year',
    gvar='cohort', rolling='demean',
    exclude_pre_periods=1
)

# Robustness check: exclude 2 periods before treatment
result_robust2 = lwdid(
    data, y='outcome', ivar='unit', tvar='year',
    gvar='cohort', rolling='demean',
    exclude_pre_periods=2
)

# Compare results
print(f"Baseline ATT: {result_baseline.att_overall:.4f}")
print(f"Exclude 1 period: {result_robust1.att_overall:.4f}")
print(f"Exclude 2 periods: {result_robust2.att_overall:.4f}")

Minimum pre-treatment period requirements:

demean: At least 1 pre-treatment period remaining after exclusion
detrend: At least 2 pre-treatment periods remaining after exclusion

If any cohort has insufficient pre-treatment periods after applying exclude_pre_periods, an InsufficientPrePeriodsError is raised.

Accessing Staggered Results 

# Overall weighted effect
results.att_overall       # ATT estimate
results.se_overall        # Standard error

# Cohort-specific effects
df_cohorts = results.att_by_cohort

# Cohort-time specific effects
df_gt = results.att_by_cohort_time

# Event study visualization
fig, ax = results.plot_event_study()

Control Group Strategies 

Not-yet-treated (default):

Uses both never-treated units and units that will be treated in future periods. More efficient but requires stronger no-anticipation assumption.

results = lwdid(..., control_group='not_yet_treated')

Never-treated only:

Uses only units that are never treated during the observation period. More robust but potentially less efficient.

results = lwdid(..., control_group='never_treated')

Estimator Selection 

Regression Adjustment (RA):

Consistent when conditional mean is correctly specified
Default estimator, efficient under correct specification

Doubly Robust (IPWRA):

Consistent if either outcome model or propensity score is correct
Recommended when functional form assumptions are uncertain

results = lwdid(..., estimator='ipwra')

Inverse Probability Weighting (IPW):

Weights by propensity score
Consistent when propensity score is correctly specified

Propensity Score Matching (PSM):

Matches treated to control units by propensity score
Use when sample sizes are adequate for matching

Staggered Example 

import pandas as pd
from lwdid import lwdid

# Load castle law data
data = pd.read_csv('castle.csv')

# Estimate with staggered design
results = lwdid(
    data,
    y='lhomicide',
    ivar='state',
    tvar='year',
    gvar='effyear',
    rolling='demean',
    aggregate='overall',
    control_group='not_yet_treated',
    estimator='ipwra',
)

# Display results
print(results.summary())

# Event study plot
fig, ax = results.plot_event_study(
    include_pre_treatment=True,
    show_ci=True
)

Pre-treatment Dynamics and Parallel Trends Testing 

Pre-treatment dynamics analysis enables assessment of the parallel trends assumption validity by estimating treatment effects in pre-treatment periods. Under the null hypothesis of parallel trends, these pre-treatment effects should be zero.

This feature implements the methodology from Lee and Wooldridge (2025), using rolling transformations that look forward to future pre-treatment periods rather than backward.

Why Pre-treatment Dynamics Matter 

The parallel trends assumption is fundamental to difference-in-differences identification. While this assumption cannot be directly tested (it concerns counterfactual outcomes), examining pre-treatment dynamics provides indirect evidence:

Under parallel trends: Pre-treatment ATT estimates should be approximately zero (within sampling variation)
Violation of parallel trends: Systematic non-zero pre-treatment effects suggest the control group may not provide a valid counterfactual

The anchor point convention sets the effect at event time e = -1 (the period immediately before treatment) to exactly zero, providing a reference point for interpreting other pre-treatment effects.

Basic Usage 

Enable pre-treatment dynamics by setting include_pretreatment=True:

from lwdid import lwdid

results = lwdid(
    data,
    y='outcome',
    ivar='unit',
    tvar='year',
    gvar='first_treat',
    rolling='demean',
    aggregate='cohort',
    include_pretreatment=True,    # Enable pre-treatment estimation
    pretreatment_test=True,       # Run parallel trends test (default)
    pretreatment_alpha=0.05,      # Significance level (default)
)

# View complete summary including pre-treatment results
print(results.summary())

Accessing Pre-treatment Results 

# Pre-treatment ATT estimates as DataFrame
pre_df = results.att_pre_treatment
print(pre_df)
# Columns: cohort, period, event_time, att, se, ci_lower, ci_upper,
#          t_stat, pvalue, n_treated, n_control, is_anchor, rolling_window_size

# Parallel trends test results
pt = results.parallel_trends_test
print(f"Joint F-statistic: {pt.joint_f_stat:.4f}")
print(f"Joint p-value: {pt.joint_pvalue:.4f}")
print(f"Reject parallel trends: {pt.reject_null}")
print(f"Number of pre-treatment periods tested: {pt.n_periods}")

Interpreting the Parallel Trends Test 

The parallel trends test performs a joint F-test of the null hypothesis that all pre-treatment ATT estimates (excluding the anchor point) are jointly zero:

H0: All pre-treatment ATT = 0 (parallel trends holds)
H1: At least one pre-treatment ATT ≠ 0 (parallel trends violated)

Interpretation guidelines:

reject_null=False: No statistical evidence against parallel trends. This does not prove parallel trends holds, but the data are consistent with it.
reject_null=True: Evidence of pre-treatment differences. Consider:
- Using detrend instead of demean to allow for heterogeneous trends
- Examining which cohorts or periods show violations
- Investigating potential anticipation effects or data issues

Event Study Visualization with Pre-treatment 

The event study plot can display both pre-treatment and post-treatment effects:

# Plot with pre-treatment effects
fig, ax = results.plot_event_study(
    include_pre_treatment=True,      # Show pre-treatment effects
    pre_treatment_color='gray',      # Color for pre-treatment points
    post_treatment_color='blue',     # Color for post-treatment points
    show_anchor_line=True,           # Vertical line at e=-1
    title='Event Study with Pre-treatment Dynamics',
    ylabel='Treatment Effect',
)
fig.savefig('event_study_pretreatment.png', dpi=300)

The plot shows:

Pre-treatment effects (e < -1) in gray with confidence intervals
Anchor point (e = -1) at zero
Post-treatment effects (e ≥ 0) in blue with confidence intervals
Optional vertical dashed line at the anchor point

Complete Workflow Example 

import pandas as pd
from lwdid import lwdid

# Load data
data = pd.read_csv('castle.csv')

# Step 1: Estimate with pre-treatment dynamics
results = lwdid(
    data,
    y='lhomicide',
    ivar='state',
    tvar='year',
    gvar='effyear',
    rolling='demean',
    aggregate='cohort',
    include_pretreatment=True,
    pretreatment_test=True,
)

# Step 2: Check parallel trends test
pt = results.parallel_trends_test
if pt.reject_null:
    print("Warning: Evidence against parallel trends!")
    print(f"F-stat: {pt.joint_f_stat:.3f}, p-value: {pt.joint_pvalue:.4f}")
    print("Consider using rolling='detrend' to allow heterogeneous trends")
else:
    print("No evidence against parallel trends")
    print(f"F-stat: {pt.joint_f_stat:.3f}, p-value: {pt.joint_pvalue:.4f}")

# Step 3: Examine pre-treatment effects
pre_df = results.att_pre_treatment
non_anchor = pre_df[~pre_df['is_anchor']]
print("\nPre-treatment effects (excluding anchor):")
print(non_anchor[['event_time', 'att', 'se', 'pvalue']].to_string(index=False))

# Step 4: Visualize
fig, ax = results.plot_event_study(
    include_pre_treatment=True,
    title='Castle Doctrine: Event Study with Pre-treatment',
)
fig.savefig('castle_event_study.png', dpi=300, bbox_inches='tight')

# Step 5: Report main results
print(f"\nOverall ATT: {results.att_overall:.4f} (SE: {results.se_overall:.4f})")

Transformation Methods for Pre-treatment 

Pre-treatment dynamics support all four transformation methods (demean, detrend, demeanq, detrendq):

Demean (default):

Uses rolling demeaning with future pre-treatment periods. For period t < g:

\[\dot{Y}_{itg} = Y_{it} - \frac{1}{g-1-t} \sum_{q=t+1}^{g-1} Y_{iq}\]

Detrend:

Uses rolling OLS detrending with future pre-treatment periods. Fits a linear trend using periods {t+1, …, g-1} and computes the residual.

The detrend option is appropriate when cohorts are suspected to have different pre-treatment trends.

Trend Diagnostics Tools 

For exploratory analysis before running estimation, lwdid provides standalone diagnostic functions in the trend_diagnostics module. These tools help assess whether parallel trends holds and recommend an appropriate transformation method.

Testing Parallel Trends with Standalone Function 

import lwdid

# Test parallel trends using placebo method
pt_result = lwdid.test_parallel_trends(
    data=df,
    y='outcome',
    ivar='unit_id',
    tvar='year',
    gvar='first_treat',
    method='placebo',   # or 'regression', 'visual'
    alpha=0.05
)

# Display summary
print(pt_result.summary())

# Access individual results
if pt_result.reject_null:
    print(f"Parallel trends rejected (p={pt_result.pvalue:.4f})")
    print(f"Recommendation: {pt_result.recommendation}")
else:
    print("No evidence against parallel trends")

Diagnosing Heterogeneous Trends 

# Diagnose trend heterogeneity across cohorts
ht_diag = lwdid.diagnose_heterogeneous_trends(
    data=df,
    y='outcome',
    ivar='unit_id',
    tvar='year',
    gvar='first_treat'
)

# Display summary
print(ht_diag.summary())

# Check for heterogeneous trends
if ht_diag.has_heterogeneous_trends:
    print("Heterogeneous trends detected")
    print(f"Recommendation: {ht_diag.recommendation}")

Getting Automated Transformation Recommendation 

# Get data-driven recommendation
rec = lwdid.recommend_transformation(
    data=df,
    y='outcome',
    ivar='unit_id',
    tvar='year',
    gvar='first_treat',
    verbose=True
)

print(f"Recommended method: {rec.recommended_method}")
print(f"Confidence: {rec.confidence:.1%}")
for reason in rec.reasons:
    print(f"  - {reason}")

# Use the recommendation
results = lwdid.lwdid(
    data=df,
    y='outcome',
    ivar='unit_id',
    tvar='year',
    gvar='first_treat',
    rolling=rec.recommended_method,
)

Visualizing Cohort Trends 

# Plot outcome trajectories by cohort
fig = lwdid.plot_cohort_trends(
    data=df,
    y='outcome',
    ivar='unit_id',
    tvar='year',
    gvar='first_treat',
    normalize=True    # Normalize to first period
)
fig.savefig('cohort_trends.png', dpi=300)

See Trend Diagnostics Module (trend_diagnostics) for complete API documentation.