Trend Diagnostics Module (trend_diagnostics)
The trend diagnostics module provides tools for assessing the parallel trends assumption and detecting heterogeneous trends across treatment cohorts in difference-in-differences settings.
Overview
This module supports three main diagnostic workflows:
Parallel trends testing: Assess whether treated and control groups exhibit similar outcome trends prior to treatment.
Heterogeneous trends diagnosis: Detect cohort-specific linear trends that may violate the standard parallel trends assumption.
Transformation recommendation: Provide data-driven guidance on whether to use demeaning or detrending based on diagnostic results.
The conditional heterogeneous trends (CHT) framework from Lee and Wooldridge (2025) allows each treatment cohort to have its own linear trend, relaxing the standard parallel trends assumption. When CHT holds but parallel trends fails, detrending removes cohort-specific linear trends and restores consistency.
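To make the difference between demeaning and detrending concrete, here is a minimal sketch that uses only the Python standard library. It is not the library's implementation: the helper names `fit_line`, `demean`, and `detrend` are hypothetical, and the data are a single made-up untreated unit whose outcome follows a pure linear trend.

```python
from statistics import mean

def fit_line(t, y):
    """Ordinary least squares fit of y on t; returns (intercept, slope)."""
    t_bar, y_bar = mean(t), mean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope

def demean(y, n_pre):
    """Subtract the unit's pre-treatment mean from every period."""
    pre_mean = mean(y[:n_pre])
    return [yi - pre_mean for yi in y]

def detrend(y, n_pre):
    """Subtract the linear trend fitted on the unit's pre-treatment periods."""
    t = list(range(len(y)))
    intercept, slope = fit_line(t[:n_pre], y[:n_pre])
    return [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]

# Hypothetical untreated unit: y_t = 2 + 0.5 * t, with four pre-treatment periods.
y = [2 + 0.5 * t for t in range(6)]
demeaned = demean(y, n_pre=4)
detrended = detrend(y, n_pre=4)
print([round(v, 2) for v in demeaned])   # the trend survives demeaning
print([round(v, 2) for v in detrended])  # the trend is removed by detrending
```

In this example the demeaned series keeps drifting upward after the pre-treatment window, which would look like a treatment effect, while the detrended series stays at zero.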
Testing Functions
test_parallel_trends
- lwdid.trend_diagnostics.test_parallel_trends(data, y, ivar, tvar, gvar=None, controls=None, method='placebo', estimator='ra', alpha=0.05, n_bootstrap=0, never_treated_values=None, rolling='demean', verbose=True)
Test the parallel trends assumption.
Estimates placebo treatment effects in pre-treatment periods to assess whether the parallel trends assumption holds. Under the null hypothesis of parallel trends, all pre-treatment ATT estimates should be statistically indistinguishable from zero.
This function uses rolling transformations (not simple 2x2 DiD) to properly estimate pre-treatment ATTs with correct standard errors that account for the panel structure.
- Parameters:
data (pd.DataFrame) – Panel data in long format.
y (str) – Outcome variable column name.
ivar (str) – Unit identifier column name.
tvar (str) – Time variable column name.
gvar (str, optional) – Cohort variable for staggered designs. If None, assumes common timing.
controls (list of str, optional) – Control variable column names.
method (str, default 'placebo') – Testing method:
- 'placebo': Estimate pre-treatment ATTs using rolling transformation
- 'regression': Formal regression-based test for trend differences
- 'visual': Generate pre-trends plot only
- 'joint': Combine placebo and regression tests
estimator (str, default 'ra') – Estimator for ATT: ‘ra’, ‘ipwra’, ‘psm’.
alpha (float, default 0.05) – Significance level for hypothesis tests.
n_bootstrap (int, default 0) – Number of bootstrap replications for SE. If 0, use analytical SE.
never_treated_values (list, optional) – Values in gvar indicating never-treated units.
rolling (str, default 'demean') – Rolling transformation method: ‘demean’ or ‘detrend’. Demeaning subtracts pre-treatment means; detrending removes unit-specific linear trends.
verbose (bool, default True) – Whether to print summary.
- Returns:
Test results including pre-treatment estimates, joint test, and method recommendation.
- Return type:
ParallelTrendsTestResult
Notes
The testing procedure:
Apply rolling transformation (demeaning or detrending) to pre-treatment periods using future pre-treatment periods as the baseline.
Estimate ATT for each pre-treatment event time using the transformed outcomes.
Under H0 (parallel trends), all pre-treatment ATTs should be zero.
Perform joint F-test for H0: all pre-treatment ATT = 0.
The anchor point (event time e = -1) is set to 0 by construction and excluded from testing.
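The steps above can be sketched in a few lines of standard-library Python. This is an illustrative reading of the procedure, not the library's code: it assumes the demeaning variant measures each pre-treatment period against the mean of the later pre-treatment periods, uses a plain difference in means in place of the 'ra' estimator, and omits standard errors and the joint F-test. The function names and data are hypothetical.

```python
from statistics import mean

def rolling_demean_pre(y_pre):
    """Transform one unit's pre-treatment outcomes (ordered by time).

    Each period is measured against the mean of the later pre-treatment
    periods. The last period (e = -1) has no later period and is set to 0.
    """
    out = []
    for k, yk in enumerate(y_pre):
        future = y_pre[k + 1:]
        out.append(yk - mean(future) if future else 0.0)
    return out

def placebo_atts(treated, control):
    """Difference in mean transformed outcomes, keyed by event time."""
    tr = [rolling_demean_pre(u) for u in treated]
    co = [rolling_demean_pre(u) for u in control]
    n_pre = len(treated[0])
    return {
        k - n_pre: mean(u[k] for u in tr) - mean(u[k] for u in co)
        for k in range(n_pre)
    }

# Hypothetical data: three pre-treatment periods per unit.
control = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
parallel = [[5.0, 6.0, 7.0], [6.0, 7.0, 8.0]]     # same slope as the controls
diverging = [[5.0, 7.0, 9.0], [6.0, 8.0, 10.0]]   # steeper slope than the controls

print(placebo_atts(parallel, control))
print(placebo_atts(diverging, control))
```

With parallel pre-trends every placebo estimate is zero. With diverging pre-trends the estimates at e = -3 and e = -2 move away from zero, while e = -1 stays at zero by construction.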
See also
diagnose_heterogeneous_trends: Diagnose trend heterogeneity.
recommend_transformation: Get method recommendation.
diagnose_heterogeneous_trends
- lwdid.trend_diagnostics.diagnose_heterogeneous_trends(data, y, ivar, tvar, gvar=None, controls=None, never_treated_values=None, include_control_group=True, alpha=0.05, verbose=True)
Diagnose heterogeneous trends across treatment cohorts.
Estimates pre-treatment linear trends for each cohort and tests whether trends differ significantly. Under the conditional heterogeneous trends (CHT) assumption, different cohorts may have different linear trends, which can be removed by detrending.
- Parameters:
data (pd.DataFrame) – Panel data in long format.
y (str) – Outcome variable column name.
ivar (str) – Unit identifier column name.
tvar (str) – Time variable column name.
gvar (str, optional) – Cohort variable. If None, assumes common timing.
controls (list of str, optional) – Control variable column names.
never_treated_values (list, optional) – Values indicating never-treated units. Default: [0, np.inf].
include_control_group (bool, default True) – Whether to include never-treated group in trend analysis.
alpha (float, default 0.05) – Significance level for tests.
verbose (bool, default True) – Whether to print summary.
- Returns:
Diagnostic results including cohort trends, heterogeneity test, and method recommendation.
- Return type:
HeterogeneousTrendsDiagnostics
Notes
Under the CHT framework, the expected outcome in the never-treated state includes cohort-specific linear time trends. Each cohort g has its own trend coefficient, allowing for differential pre-treatment trajectories across cohorts.
The heterogeneity test uses an F-test for the null hypothesis that all cohort trends are equal. Rejection suggests detrending may be appropriate to remove cohort-specific trends.
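As a rough illustration of the equal-slopes F-test, the sketch below compares an unrestricted model (separate intercept and slope per cohort) with a restricted model (separate intercepts, one common slope). It is a textbook construction under assumed homoskedastic errors, not necessarily how the library computes its statistic. The function names and the cohort data are hypothetical, and the p-value step is left out because it needs an F distribution (for example `scipy.stats.f.sf`).

```python
def slope_stats(t, y):
    """Return centered sums (Sxx, Sxy, Syy) for one cohort's pre-treatment series."""
    n = len(t)
    t_bar, y_bar = sum(t) / n, sum(y) / n
    sxx = sum((a - t_bar) ** 2 for a in t)
    sxy = sum((a - t_bar) * (b - y_bar) for a, b in zip(t, y))
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxx, sxy, syy

def equal_slopes_f(cohorts):
    """F statistic for H0: all cohort slopes are equal.

    `cohorts` maps a cohort label to (t, y) lists of pre-treatment values.
    Returns (slopes, f_stat, df_num, df_den).
    """
    stats = {g: slope_stats(t, y) for g, (t, y) in cohorts.items()}
    slopes = {g: sxy / sxx for g, (sxx, sxy, _) in stats.items()}
    # Unrestricted model: separate intercept and slope per cohort.
    rss_u = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in stats.values())
    # Restricted model: separate intercepts, one common slope b.
    b = sum(s[1] for s in stats.values()) / sum(s[0] for s in stats.values())
    rss_r = sum(syy - 2 * b * sxy + b ** 2 * sxx for sxx, sxy, syy in stats.values())
    n_obs = sum(len(t) for t, _ in cohorts.values())
    df_num = len(cohorts) - 1
    df_den = n_obs - 2 * len(cohorts)
    f_stat = ((rss_r - rss_u) / df_num) / (rss_u / df_den)
    return slopes, f_stat, df_num, df_den

# Hypothetical pre-treatment series for two cohorts, five periods each.
t = [0, 1, 2, 3, 4]
noise = [0.1, -0.1, 0.1, -0.1, 0.1]
same = {
    "g2004": (t, [1 + 0.5 * a + e for a, e in zip(t, noise)]),
    "g2006": (t, [2 + 0.5 * a + e for a, e in zip(t, noise)]),
}
different = {
    "g2004": (t, [1 + 0.5 * a + e for a, e in zip(t, noise)]),
    "g2006": (t, [2 + 1.5 * a + e for a, e in zip(t, noise)]),
}
slopes, f_stat, df_num, df_den = equal_slopes_f(different)
print(slopes, round(f_stat, 1), df_num, df_den)
```

A large F statistic relative to the F(df_num, df_den) distribution points toward cohort-specific trends, which is the case where detrending is worth considering.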
See also
test_parallel_trends: Test parallel trends assumption.
recommend_transformation: Get method recommendation.
recommend_transformation
- lwdid.trend_diagnostics.recommend_transformation(data, y, ivar, tvar, gvar=None, controls=None, never_treated_values=None, run_all_diagnostics=True, verbose=True)
Recommend optimal transformation method based on data characteristics.
Combines multiple diagnostic procedures to provide an informed recommendation on whether to use demean, detrend, or seasonal variants.
- Parameters:
data (pd.DataFrame) – Panel data in long format.
y (str) – Outcome variable column name.
ivar (str) – Unit identifier column name.
tvar (str) – Time variable column name.
gvar (str, optional) – Cohort variable for staggered designs.
controls (list of str, optional) – Control variable column names.
never_treated_values (list, optional) – Values indicating never-treated units.
run_all_diagnostics (bool, default True) – Whether to run all diagnostic tests. If False, uses heuristics only.
verbose (bool, default True) – Whether to print summary.
- Returns:
Recommendation with confidence level and supporting diagnostics.
- Return type:
TransformationRecommendation
Notes
The recommendation algorithm considers:
Data requirements: detrend requires at least 2 pre-treatment periods
Parallel trends test: If violated, recommend detrend
Trend heterogeneity: If detected, recommend detrend
Panel balance: Unbalanced panels favor detrend
Seasonal patterns: If detected, recommend seasonal variants
Decision tree:
n_pre_periods < 2?
|-- Yes -> demean (detrend not feasible)
+-- No -> Run diagnostics
    |-- PT violated OR heterogeneous trends?
    |   |-- Yes -> detrend
    |   +-- No -> demean (more efficient)
    +-- Seasonal patterns?
        |-- Yes -> demeanq/detrendq
        +-- No -> demean/detrend
See also
test_parallel_trends: Test parallel trends assumption.
diagnose_heterogeneous_trends: Diagnose trend heterogeneity.
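The decision tree in the Notes for recommend_transformation can be written as a small pure function. This is one reading of the tree, not the library's code: the function name and its boolean arguments are hypothetical, and it assumes the seasonal branch simply switches the chosen method to its 'q' variant.

```python
def choose_transformation(n_pre_periods, pt_violated, heterogeneous_trends, seasonal):
    """Walk the decision tree and return a transformation name."""
    if n_pre_periods < 2:
        # Detrending needs at least two pre-treatment periods to fit a line.
        return "demean"
    # Either diagnostic failing points to detrending; otherwise demeaning
    # is preferred because it is more efficient.
    base = "detrend" if (pt_violated or heterogeneous_trends) else "demean"
    # Seasonal patterns select the seasonal variant of the chosen method.
    return base + "q" if seasonal else base

print(choose_transformation(1, True, True, False))
print(choose_transformation(5, False, False, False))
print(choose_transformation(5, True, False, False))
print(choose_transformation(5, False, True, True))
```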
Visualization
plot_cohort_trends
- lwdid.trend_diagnostics.plot_cohort_trends(data, y, ivar, tvar, gvar, controls=None, never_treated_values=None, normalize=True, normalize_period=None, show_treatment_lines=True, show_trend_lines=True, confidence_bands=True, alpha=0.05, figsize=(12, 8), ax=None)
Visualize outcome trends by treatment cohort.
Creates a plot showing average outcome trajectories for each cohort, with optional trend lines and treatment timing indicators.
- Parameters:
data (pd.DataFrame) – Panel data in long format.
y (str) – Outcome variable column name.
ivar (str) – Unit identifier column name.
tvar (str) – Time variable column name.
gvar (str) – Cohort variable indicating first treatment period.
controls (list of str, optional) – Control variables for residualized outcomes.
never_treated_values (list, optional) – Values indicating never-treated units.
normalize (bool, default True) – Whether to normalize outcomes (subtract baseline).
normalize_period (int, optional) – Period to use as baseline. Default: period before first treatment.
show_treatment_lines (bool, default True) – Whether to show vertical lines at treatment timing.
show_trend_lines (bool, default True) – Whether to show fitted linear trend lines.
confidence_bands (bool, default True) – Whether to show confidence bands around means.
alpha (float, default 0.05) – Significance level for confidence bands.
figsize (tuple, default (12, 8)) – Figure size in inches.
ax (matplotlib.axes.Axes, optional) – Axes to plot on. If None, creates new figure.
- Returns:
Figure containing the cohort trends plot.
- Return type:
matplotlib.figure.Figure
Notes
This visualization helps assess:
Whether pre-treatment trends are parallel across cohorts
Whether treatment effects appear at the expected timing
Whether trends are approximately linear (for detrending)
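The quantity behind such a plot is the cohort-by-period mean outcome, optionally shifted so that a baseline period sits at zero. The following sketch computes those normalized trajectories with the standard library only. The function name, the row layout `(cohort, period, outcome)`, and the data are hypothetical, and the plotting itself is left out.

```python
from collections import defaultdict
from statistics import mean

def cohort_trajectories(rows, baseline_period):
    """Return {cohort: {period: mean outcome minus the cohort's baseline mean}}."""
    cells = defaultdict(list)
    for cohort, period, outcome in rows:
        cells[(cohort, period)].append(outcome)
    means = defaultdict(dict)
    for (cohort, period), values in cells.items():
        means[cohort][period] = mean(values)
    # Normalizing subtracts each cohort's own mean in the baseline period.
    return {
        g: {p: m - series[baseline_period] for p, m in sorted(series.items())}
        for g, series in means.items()
    }

# Hypothetical long-format rows; cohort 0 stands for never-treated units.
rows = [
    (2003, 2001, 1.0), (2003, 2001, 3.0), (2003, 2002, 4.0),
    (0, 2001, 1.0), (0, 2002, 1.5),
]
trajectories = cohort_trajectories(rows, baseline_period=2001)
print(trajectories)
```

Because every cohort starts from zero at the baseline, differences in slope before treatment are easier to see than in the raw levels.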
Enumerations
TrendTestMethod
TransformationMethod
RecommendationConfidence
Result Classes
ParallelTrendsTestResult
- class lwdid.trend_diagnostics.ParallelTrendsTestResult(method, reject_null, pvalue, test_statistic, pre_trend_estimates, joint_f_stat=None, joint_pvalue=None, joint_df=(0, 0), recommendation='demean', recommendation_reason='', figure=None, warnings=<factory>)
Results from testing the parallel trends assumption.
Aggregates pre-treatment ATT estimates and joint test statistics to assess whether the parallel trends assumption is likely to hold. Includes a method recommendation based on the test outcome.
- Variables:
method (TrendTestMethod) – Testing method used (placebo, regression, visual, or joint).
reject_null (bool) – Whether to reject H0 that parallel trends holds.
pvalue (float) – P-value for the overall test.
test_statistic (float) – Test statistic value (F-statistic for joint test).
pre_trend_estimates (list of PreTrendEstimate) – Pre-treatment ATT estimates by event time.
joint_f_stat (float or None) – F-statistic for the joint test H0: all pre-ATT = 0.
joint_pvalue (float or None) – P-value for the joint F-test.
joint_df (tuple of int) – Degrees of freedom (numerator, denominator) for the F-test.
recommendation (str) – Recommended transformation method based on test results.
recommendation_reason (str) – Explanation for the recommendation.
figure (Any or None) – Pre-trends visualization figure object.
warnings (list of str) – Warning messages from the testing procedure.
- method: TrendTestMethod
- pre_trend_estimates: list[PreTrendEstimate]
HeterogeneousTrendsDiagnostics
- class lwdid.trend_diagnostics.HeterogeneousTrendsDiagnostics(trend_by_cohort, trend_heterogeneity_test, trend_differences, control_group_trend, has_heterogeneous_trends, recommendation, recommendation_confidence, recommendation_reason, figure=None)
Diagnostic results for heterogeneous trends across cohorts.
Tests whether different treatment cohorts have different pre-treatment linear trends. Significant heterogeneity violates the standard parallel trends assumption but may be accommodated by detrending to remove cohort-specific trends under the CHT framework.
- Variables:
trend_by_cohort (list of CohortTrendEstimate) – Estimated linear trends for each treatment cohort.
trend_heterogeneity_test (dict) – Results of the F-test for overall trend heterogeneity. Keys: 'f_stat', 'pvalue', 'df_num', 'df_den', 'reject_null'.
trend_differences (list of TrendDifference) – Pairwise trend differences between all cohort pairs.
control_group_trend (CohortTrendEstimate or None) – Trend estimate for the never-treated control group.
has_heterogeneous_trends (bool) – Whether significant trend heterogeneity is detected.
recommendation (str) – Recommended transformation method based on diagnostics.
recommendation_confidence (float) – Confidence score for the recommendation (0 to 1).
recommendation_reason (str) – Explanation for the recommendation.
figure (Any or None) – Trend comparison visualization figure object.
- trend_by_cohort: list[CohortTrendEstimate]
- trend_differences: list[TrendDifference]
- control_group_trend: CohortTrendEstimate | None
TransformationRecommendation
- class lwdid.trend_diagnostics.TransformationRecommendation(recommended_method, confidence, confidence_level, reasons, parallel_trends_test=None, heterogeneous_trends_diag=None, n_pre_periods_min=0, n_pre_periods_max=0, has_seasonal_pattern=False, is_balanced_panel=True, alternative_method=None, alternative_reason=None, warnings=<factory>)
Comprehensive recommendation for transformation method selection.
Combines parallel trends test results, heterogeneous trends diagnostics, and data characteristics to provide an informed recommendation on whether to use demean, detrend, or their seasonal variants.
- Variables:
recommended_method (str) – Primary recommendation: ‘demean’, ‘detrend’, ‘demeanq’, or ‘detrendq’.
confidence (float) – Confidence score for the recommendation (0 to 1).
confidence_level (RecommendationConfidence) – Categorical confidence level (HIGH, MEDIUM, or LOW).
reasons (list of str) – List of reasons supporting the recommendation.
parallel_trends_test (ParallelTrendsTestResult or None) – Results from the parallel trends test, if performed.
heterogeneous_trends_diag (HeterogeneousTrendsDiagnostics or None) – Results from heterogeneous trends diagnostics, if performed.
n_pre_periods_min (int) – Minimum number of pre-treatment periods across cohorts.
n_pre_periods_max (int) – Maximum number of pre-treatment periods across cohorts.
has_seasonal_pattern (bool) – Whether seasonal patterns are detected in the outcome.
is_balanced_panel (bool) – Whether the panel is balanced.
alternative_method (str or None) – Alternative recommendation if primary is not suitable.
alternative_reason (str or None) – Explanation for the alternative recommendation.
warnings (list of str) – Warning messages about data limitations or method constraints.
- confidence_level: RecommendationConfidence
- parallel_trends_test: ParallelTrendsTestResult | None = None
- heterogeneous_trends_diag: HeterogeneousTrendsDiagnostics | None = None
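The data-characteristic fields (`n_pre_periods_min`, `n_pre_periods_max`, `is_balanced_panel`) describe properties that can be checked directly on a long-format panel. The sketch below shows one way to compute them with the standard library. It is an assumption-laden illustration: the function name and row layout `(unit, period, cohort)` are hypothetical, never-treated units are assumed to be coded as 0, and at least one treated cohort must be present.

```python
from collections import defaultdict

def panel_summary(rows, never_treated=(0,)):
    """Summarize balance and pre-treatment window lengths of a panel."""
    rows = list(rows)
    periods = sorted({period for _, period, _ in rows})
    seen = defaultdict(set)
    cohorts = set()
    for unit, period, cohort in rows:
        seen[unit].add(period)
        if cohort not in never_treated:
            cohorts.add(cohort)
    # Balanced means every unit is observed in every period.
    is_balanced = all(len(p) == len(periods) for p in seen.values())
    # Pre-treatment periods for cohort g are the observed periods before g.
    n_pre = [sum(1 for p in periods if p < g) for g in cohorts]
    return {
        "is_balanced_panel": is_balanced,
        "n_pre_periods_min": min(n_pre),
        "n_pre_periods_max": max(n_pre),
    }

# Hypothetical panel: units 1 and 2 are treated in 2002 and 2004,
# unit 3 is never treated and is missing its 2003 observation.
years = range(2000, 2005)
rows = (
    [(1, y, 2002) for y in years]
    + [(2, y, 2004) for y in years]
    + [(3, y, 0) for y in years if y != 2003]
)
summary = panel_summary(rows)
print(summary)
```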
Supporting Data Classes
PreTrendEstimate
- class lwdid.trend_diagnostics.PreTrendEstimate(event_time, cohort, att, se, t_stat, pvalue, ci_lower, ci_upper, n_treated, n_control, df=0)
Pre-treatment ATT estimate for a single event time.
Stores the estimated treatment effect for a pre-treatment period, used for placebo tests and parallel trends assessment. Under the null hypothesis of parallel trends, these estimates should be statistically indistinguishable from zero.
- Variables:
event_time (int) – Event time relative to treatment onset (negative for pre-treatment).
cohort (int or None) – Treatment cohort identifier for staggered adoption designs.
att (float) – Estimated average treatment effect on the treated.
se (float) – Standard error of the ATT estimate.
t_stat (float) – t-statistic computed as att / se.
pvalue (float) – Two-sided p-value for testing H0: ATT = 0.
ci_lower (float) – Lower bound of the confidence interval.
ci_upper (float) – Upper bound of the confidence interval.
n_treated (int) – Number of treated units in the estimation sample.
n_control (int) – Number of control units in the estimation sample.
df (int) – Degrees of freedom for t-distribution inference.
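The inference fields are linked by simple formulas: `t_stat` is `att / se`, and the p-value and confidence bounds follow from a reference distribution. The sketch below uses a standard normal approximation from the standard library in place of the t-distribution with `df` degrees of freedom, so its numbers will differ slightly from t-based inference in small samples. The function name is hypothetical.

```python
from statistics import NormalDist

def summarize_estimate(att, se, alpha=0.05):
    """Normal-approximation inference for a single ATT estimate."""
    z = NormalDist()                       # standard normal, used instead of t(df)
    t_stat = att / se
    pvalue = 2 * (1 - z.cdf(abs(t_stat)))  # two-sided test of H0: ATT = 0
    crit = z.inv_cdf(1 - alpha / 2)        # critical value for a (1 - alpha) interval
    return {
        "t_stat": t_stat,
        "pvalue": pvalue,
        "ci_lower": att - crit * se,
        "ci_upper": att + crit * se,
    }

result = summarize_estimate(att=1.0, se=0.5)
print({k: round(v, 4) for k, v in result.items()})
```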
CohortTrendEstimate
- class lwdid.trend_diagnostics.CohortTrendEstimate(cohort, intercept, intercept_se, slope, slope_se, slope_pvalue, n_units, n_pre_periods, r_squared, residual_std=0.0)
Estimated linear trend for a single treatment cohort.
Stores the pre-treatment linear trend estimate for a cohort, obtained by regressing outcomes on time for pre-treatment periods. Significant slopes indicate the presence of cohort-specific trends.
- Variables:
cohort (int) – Treatment cohort identifier (first treatment period).
intercept (float) – Estimated intercept of the trend regression.
intercept_se (float) – Standard error of the intercept estimate.
slope (float) – Estimated slope representing the linear time trend.
slope_se (float) – Standard error of the slope estimate.
slope_pvalue (float) – Two-sided p-value for testing H0: slope = 0.
n_units (int) – Number of units in this cohort.
n_pre_periods (int) – Number of pre-treatment periods used in estimation.
r_squared (float) – Coefficient of determination for the trend regression.
residual_std (float) – Standard deviation of regression residuals.
TrendDifference
- class lwdid.trend_diagnostics.TrendDifference(cohort_1, cohort_2, slope_1, slope_2, slope_diff, slope_diff_se, t_stat, pvalue, df)
Pairwise difference in linear trends between two cohorts.
Stores the result of testing whether two cohorts have equal pre-treatment trends. Under the parallel trends assumption, all pairwise differences should be statistically indistinguishable from zero.
- Variables:
cohort_1 (int) – First cohort identifier.
cohort_2 (int) – Second cohort identifier.
slope_1 (float) – Estimated slope for the first cohort.
slope_2 (float) – Estimated slope for the second cohort.
slope_diff (float) – Difference in slopes (slope_1 - slope_2).
slope_diff_se (float) – Standard error of the slope difference.
t_stat (float) – t-statistic for testing equal slopes.
pvalue (float) – Two-sided p-value for H0: slope_1 = slope_2.
df (int) – Degrees of freedom for the test.
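For intuition, a pairwise comparison of two cohort slopes can be sketched as follows. The standard error formula assumes the two slope estimates are independent, which may not match how the library computes `slope_diff_se`. The function name and inputs are hypothetical.

```python
from math import sqrt

def slope_difference(slope_1, se_1, slope_2, se_2):
    """Return (slope_diff, slope_diff_se, t_stat) for H0: slope_1 = slope_2."""
    diff = slope_1 - slope_2
    # Variance of a difference of independent estimates is the sum of variances.
    diff_se = sqrt(se_1 ** 2 + se_2 ** 2)
    return diff, diff_se, diff / diff_se

diff, diff_se, t_stat = slope_difference(1.5, 0.3, 0.5, 0.4)
print(round(diff, 3), round(diff_se, 3), round(t_stat, 3))
```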
Usage Examples
Testing Parallel Trends
import lwdid
import pandas as pd

# df is assumed to be a long-format panel DataFrame (one row per unit-period)

# Test parallel trends with placebo method
pt_result = lwdid.test_parallel_trends(
    data=df,
    y='outcome',
    ivar='unit_id',
    tvar='year',
    gvar='first_treat',
    method='placebo',
    alpha=0.05
)

# Display summary
print(pt_result.summary())

# Check recommendation
if pt_result.reject_null:
    print("Parallel trends rejected, consider detrending")
else:
    print("Parallel trends not rejected, demeaning appropriate")
Diagnosing Heterogeneous Trends
# Diagnose trend heterogeneity across cohorts
ht_diag = lwdid.diagnose_heterogeneous_trends(
    data=df,
    y='outcome',
    ivar='unit_id',
    tvar='year',
    gvar='first_treat'
)

# Display summary
print(ht_diag.summary())

# Check if heterogeneous trends detected
if ht_diag.has_heterogeneous_trends:
    print(f"Recommendation: {ht_diag.recommendation}")
Getting Transformation Recommendation
# Get automated transformation recommendation
rec = lwdid.recommend_transformation(
    data=df,
    y='outcome',
    ivar='unit_id',
    tvar='year',
    gvar='first_treat',
    verbose=True
)

print(f"Recommended method: {rec.recommended_method}")
print(f"Confidence: {rec.confidence:.1%}")
for reason in rec.reasons:
    print(f" - {reason}")
Visualizing Cohort Trends
# Plot outcome trajectories by cohort
fig = lwdid.plot_cohort_trends(
    data=df,
    y='outcome',
    ivar='unit_id',
    tvar='year',
    gvar='first_treat',
    normalize=True
)
fig.savefig('cohort_trends.png')
Methodological Background
The parallel trends assumption requires that, in the absence of treatment, treated and control units would have followed similar outcome trajectories. Formally, for all post-treatment periods \(t \geq S\):
When this assumption fails but the conditional heterogeneous trends (CHT) assumption holds, detrending can restore identification. Under CHT, each cohort \(g\) may have its own linear trend \(\eta_g\), but the detrended outcomes satisfy:
where \(\ddot{Y}_{ir}\) denotes the detrended outcome.
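As a sketch in assumed notation, the two conditions can be written out as follows. The symbols \(Y_{it}(0)\) (untreated potential outcome), \(D_i\) (treatment group indicator), \(G_i\) (cohort, with \(G_i = \infty\) for never-treated units), \(\alpha_g\), \(\lambda_t\), and the reference period \(S-1\) are illustrative choices and are not taken from the library's documentation. Only \(S\), \(g\), \(\eta_g\), and \(\ddot{Y}_{ir}\) come from the text above.

Parallel trends in the common-timing case, for \(t \geq S\):

\[
E\big[Y_{it}(0) - Y_{i,S-1}(0) \mid D_i = 1\big] = E\big[Y_{it}(0) - Y_{i,S-1}(0) \mid D_i = 0\big]
\]

One way to express CHT, with a cohort-specific intercept and linear trend plus a common time effect:

\[
E\big[Y_{it}(0) \mid G_i = g\big] = \alpha_g + \eta_g\, t + \lambda_t
\]

Under this form, removing each unit's linear trend fitted on the periods before \(g\) eliminates \(\alpha_g + \eta_g t\), so the detrended untreated outcomes have the same mean in cohort \(g\) and in the never-treated group for \(r \geq g\):

\[
E\big[\ddot{Y}_{ir}(0) \mid G_i = g\big] = E\big[\ddot{Y}_{ir}(0) \mid G_i = \infty\big]
\]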
Decision Framework
The recommended workflow for transformation selection:
Run test_parallel_trends() with method='placebo'
If parallel trends not rejected: use rolling='demean'
If parallel trends rejected: run diagnose_heterogeneous_trends()
If heterogeneous trends detected: use rolling='detrend'
Use recommend_transformation() for automated guidance