Sensitivity Analysis

This module provides tools for assessing the robustness of difference-in-differences estimates to specification choices, implementing the recommendations from Lee and Wooldridge (2026) and Lee and Wooldridge (2025).

Main Functions

lwdid.sensitivity.robustness_pre_periods(data, y, ivar, tvar, gvar=None, d=None, post=None, rolling='demean', estimator='ra', controls=None, vce=None, cluster_var=None, pre_period_range=None, step=1, exclude_periods_before_treatment=0, robustness_threshold=0.25, alpha=0.05, verbose=True)

Assess robustness of ATT estimates to pre-treatment period selection.

Tests how ATT estimates vary when using different numbers of pre-treatment periods, allowing researchers to assess whether findings are robust to this methodological choice.

Parameters:

data (pd.DataFrame) – Panel data in long format.
y (str) – Outcome variable column name.
ivar (str) – Unit identifier column name.
tvar (str) – Time variable column name.
gvar (str, optional) – Cohort variable for staggered designs.
d (str, optional) – Treatment indicator for common timing.
post (str, optional) – Post-treatment indicator for common timing.
rolling ({'demean', 'detrend'}, default 'demean') – Transformation method.
estimator ({'ra', 'ipw', 'ipwra', 'psm'}, default 'ra') – Estimation method.
controls (list of str, optional) – Control variable column names.
vce (str, optional) – Variance estimator type.
cluster_var (str, optional) – Cluster variable for clustered SE.
pre_period_range (tuple of (int, int), optional) – Range of pre-treatment periods to test (min_periods, max_periods). If None, automatically determined from data.
step (int, default 1) – Step size for varying pre-treatment periods.
exclude_periods_before_treatment (int, default 0) – Number of periods to exclude immediately before treatment. Useful for testing robustness to no-anticipation violations.
robustness_threshold (float, default 0.25) – Threshold for robustness determination. Results are considered robust if sensitivity_ratio < robustness_threshold.
alpha (float, default 0.05) – Significance level for confidence intervals.
verbose (bool, default True) – Whether to print progress and summary.

Returns:

Results containing:

specifications: ATT estimates for each pre-period count
sensitivity_ratio: Range of ATT estimates relative to baseline
is_robust: Whether estimates are stable across specifications
recommendation: Interpretation and recommendations
figure: Sensitivity plot (if plot() called)

Return type:

PrePeriodRobustnessResult

Notes

The function varies the starting point of the pre-treatment window and re-estimates the ATT for each resulting specification.
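The sweep over pre-treatment windows can be pictured with a small sketch. This is an illustration under stated assumptions, not the library's implementation: `enumerate_pre_period_windows`, `first_treat`, and `earliest_period` are hypothetical names, and the rule that the window always ends just before treatment (after dropping the excluded periods) is assumed.

```python
def enumerate_pre_period_windows(first_treat, earliest_period,
                                 pre_period_range=None, step=1,
                                 exclude_periods_before_treatment=0):
    """Hypothetical helper: list the pre-treatment windows a sweep might try."""
    # Last usable pre-treatment period once the excluded periods are dropped.
    end_period = first_treat - 1 - exclude_periods_before_treatment
    available = end_period - earliest_period + 1
    if available < 1:
        raise ValueError("no pre-treatment periods left after exclusion")

    # Assumed default: try every window length the data can support.
    if pre_period_range is None:
        min_pre, max_pre = 1, available
    else:
        min_pre, max_pre = pre_period_range
    max_pre = min(max_pre, available)

    windows = []
    for n_pre in range(min_pre, max_pre + 1, step):
        windows.append({
            "n_pre_periods": n_pre,
            "start_period": end_period - n_pre + 1,  # moving start point
            "end_period": end_period,                # fixed end point
        })
    return windows


windows = enumerate_pre_period_windows(
    first_treat=2010, earliest_period=2000,
    pre_period_range=(3, 8), exclude_periods_before_treatment=1,
)
for w in windows:
    print(w)
```

Each dictionary corresponds to one re-estimation of the ATT; only the start of the window moves between specifications.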

In many applications, the policy intervention may be based on past outcomes. This analysis helps determine whether sufficient pre-treatment periods are being used to adequately control for selection into treatment.

Robustness levels based on sensitivity ratio:

< 10%: Highly robust
10-25%: Moderately robust
25-50%: Sensitive
>= 50%: Highly sensitive
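The mapping from sensitivity ratio to robustness level can be sketched as a simple threshold ladder. `Level` is a hypothetical stand-in for `RobustnessLevel`, and the handling of the exact boundaries at 10% and 25% (assigned to the less robust category here) is an assumption, chosen to agree with the rule that results are robust only if the ratio is strictly below the threshold.

```python
from enum import Enum


class Level(Enum):
    """Hypothetical stand-in mirroring the documented RobustnessLevel values."""
    HIGHLY_ROBUST = "highly_robust"
    MODERATELY_ROBUST = "moderately_robust"
    SENSITIVE = "sensitive"
    HIGHLY_SENSITIVE = "highly_sensitive"


def classify(sensitivity_ratio):
    """Map a sensitivity ratio (as a fraction, e.g. 0.18) to a level."""
    if sensitivity_ratio < 0.10:
        return Level.HIGHLY_ROBUST
    if sensitivity_ratio < 0.25:
        return Level.MODERATELY_ROBUST
    if sensitivity_ratio < 0.50:
        return Level.SENSITIVE
    return Level.HIGHLY_SENSITIVE


for ratio in (0.05, 0.18, 0.25, 0.60):
    print(f"{ratio:.0%} -> {classify(ratio).value}")
```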

Result Classes

class lwdid.sensitivity.PrePeriodRobustnessResult(specifications, baseline_spec, att_range, att_mean, att_std, sensitivity_ratio, robustness_level, is_robust, robustness_threshold, all_same_sign, all_significant, n_significant, n_sign_changes, rolling_method, estimator, n_specifications, pre_period_range_tested, recommendation, detailed_recommendations=<factory>, result_warnings=<factory>, figure=None)

Bases: object

Result of pre-treatment period robustness analysis.

Assesses how ATT estimates vary when using different numbers of pre-treatment periods, helping identify whether findings are robust to this methodological choice.

Variables:

specifications (list[SpecificationResult]) – ATT estimates for each pre-period configuration.
baseline_spec (SpecificationResult) – Estimate using all available pre-treatment periods.
att_range (tuple[float, float]) – (min ATT, max ATT) across all specifications.
att_mean (float) – Mean ATT across specifications.
att_std (float) – Standard deviation of ATT across specifications.
sensitivity_ratio (float) – Ratio of range to baseline: (max - min) / abs(baseline).
robustness_level (RobustnessLevel) – Categorical assessment of robustness.
is_robust (bool) – Whether estimates are stable (ratio < threshold).
robustness_threshold (float) – Threshold used for robustness determination.
all_same_sign (bool) – Whether all estimates have the same sign.
all_significant (bool) – Whether all estimates are significant at 5%.
n_significant (int) – Number of significant specifications.
n_sign_changes (int) – Number of specifications with sign different from baseline.
rolling_method (str) – Transformation method used.
estimator (str) – Estimation method used.
n_specifications (int) – Total number of specifications tested.
pre_period_range_tested (tuple[int, int]) – Range of pre-periods tested (min, max).
recommendation (str) – Main recommendation based on analysis.
detailed_recommendations (list[str]) – Detailed recommendations.
result_warnings (list[str]) – Warning messages.
figure (Any | None) – Matplotlib figure if plot was generated.
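To make the summary fields concrete, the sketch below derives several of them from a plain list of ATT estimates. `summarize_atts` is a hypothetical helper, not part of the package; the use of the sample standard deviation for att_std and the treatment of a zero baseline are assumptions.

```python
import statistics


def _sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def summarize_atts(atts, baseline_att):
    """Hypothetical helper: summary statistics across specifications."""
    att_min, att_max = min(atts), max(atts)
    # Documented formula: (max - min) / abs(baseline).
    if baseline_att == 0:
        ratio = float("inf")  # assumption: ratio is unbounded at a zero baseline
    else:
        ratio = (att_max - att_min) / abs(baseline_att)
    return {
        "att_range": (att_min, att_max),
        "att_mean": statistics.fmean(atts),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "att_std": statistics.stdev(atts) if len(atts) > 1 else 0.0,
        "sensitivity_ratio": ratio,
        "all_same_sign": len({_sign(a) for a in atts}) == 1,
        "n_sign_changes": sum(1 for a in atts if _sign(a) != _sign(baseline_att)),
    }


summary = summarize_atts([1.8, 2.0, 2.1, 2.3], baseline_att=2.0)
print(summary)
```

With these inputs the range is 0.5 against a baseline of 2.0, so the ratio sits at about 25%, right at the default robustness_threshold.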

specifications: list[SpecificationResult]

baseline_spec: SpecificationResult

att_range: tuple[float, float]

att_mean: float

att_std: float

sensitivity_ratio: float

robustness_level: RobustnessLevel

is_robust: bool

robustness_threshold: float

all_same_sign: bool

all_significant: bool

n_significant: int

n_sign_changes: int

rolling_method: str

estimator: str

n_specifications: int

pre_period_range_tested: tuple[int, int]

recommendation: str

detailed_recommendations: list[str]

result_warnings: list[str]

figure: Any | None = None

to_dataframe()

Convert all specification results to a pandas DataFrame.

Returns: DataFrame with one row per specification containing ATT estimates, standard errors, p-values, and other diagnostic information.
Return type: pd.DataFrame

get_specification(n_pre)

Retrieve specification result for a specific pre-period count.

Parameters: n_pre (int) – Number of pre-treatment periods to look up.
Returns: The specification result if found, None otherwise.
Return type: SpecificationResult or None

summary()

Generate a comprehensive human-readable summary report.

Returns: Formatted text report containing configuration, baseline estimates, sensitivity metrics, robustness assessment, and recommendations.
Return type: str

plot(show_ci=True, show_baseline=True, figsize=(10, 6), ax=None)

Generate sensitivity plot.

Shows ATT estimates across different pre-period specifications with confidence intervals.

Parameters:

show_ci (bool, default True) – Whether to show confidence intervals.
show_baseline (bool, default True) – Whether to show baseline reference line.
figsize (tuple, default (10, 6)) – Figure size in inches.
ax (matplotlib.axes.Axes, optional) – Axes to plot on. If None, creates new figure.

Returns:

The generated figure.

Return type:

matplotlib.figure.Figure

class lwdid.sensitivity.NoAnticipationSensitivityResult(estimates, baseline_estimate, anticipation_detected, recommended_exclusion, detection_method, recommendation, result_warnings=<factory>, figure=None)

Bases: object

Result of no-anticipation sensitivity analysis.

Tests robustness of ATT estimates to potential anticipation effects by excluding periods immediately before treatment.

Variables:

estimates (list[AnticipationEstimate]) – ATT estimates for each exclusion configuration.
baseline_estimate (AnticipationEstimate) – Estimate with no exclusion (excluded_periods=0).
anticipation_detected (bool) – Whether anticipation effects are detected.
recommended_exclusion (int) – Recommended number of periods to exclude.
detection_method (AnticipationDetectionMethod) – Method used to detect anticipation.
recommendation (str) – Interpretation and recommendations.
result_warnings (list[str]) – Warning messages.
figure (Any | None) – Matplotlib figure if plot was generated.

estimates: list[AnticipationEstimate]

baseline_estimate: AnticipationEstimate

anticipation_detected: bool

recommended_exclusion: int

detection_method: AnticipationDetectionMethod

recommendation: str

result_warnings: list[str]

figure: Any | None = None

to_dataframe()

Convert all anticipation estimates to a pandas DataFrame.

Returns: DataFrame with one row per exclusion level containing ATT estimates, standard errors, p-values, and significance indicators.
Return type: pd.DataFrame

summary()

Generate a human-readable summary of the anticipation analysis.

Returns: Formatted text report containing estimates by exclusion level, detection results, and recommendations.
Return type: str

plot(show_ci=True, figsize=(10, 6), ax=None)

Generate anticipation sensitivity plot.

Parameters:

show_ci (bool, default True) – Whether to show confidence intervals.
figsize (tuple, default (10, 6)) – Figure size in inches.
ax (matplotlib.axes.Axes, optional) – Axes to plot on.

Returns:

The generated figure.

Return type:

matplotlib.figure.Figure

class lwdid.sensitivity.ComprehensiveSensitivityResult(pre_period_result=None, anticipation_result=None, transformation_comparison=None, estimator_comparison=None, overall_assessment='', recommendations=<factory>)

Bases: object

Combined results from comprehensive sensitivity analysis.

Variables:

pre_period_result (PrePeriodRobustnessResult | None) – Results from pre-period robustness analysis.
anticipation_result (NoAnticipationSensitivityResult | None) – Results from no-anticipation sensitivity analysis.
transformation_comparison (dict | None) – Comparison of demean vs detrend results.
estimator_comparison (dict | None) – Comparison across different estimators.
overall_assessment (str) – Overall robustness assessment.
recommendations (list[str]) – List of recommendations.

pre_period_result: PrePeriodRobustnessResult | None = None

anticipation_result: NoAnticipationSensitivityResult | None = None

transformation_comparison: dict | None = None

estimator_comparison: dict | None = None

overall_assessment: str = ''

recommendations: list[str]

summary()

Generate a comprehensive summary of all sensitivity analyses.

Returns: Formatted text report containing results from pre-period robustness, anticipation sensitivity, transformation comparison, estimator comparison, overall assessment, and recommendations.
Return type: str

plot_all(figsize=(14, 10))

Generate combined visualization of all sensitivity analyses.

Parameters: figsize (tuple of float, default (14, 10)) – Figure size in inches (width, height).
Returns: Combined figure with subplots for each available analysis, or None if no results are available to plot.
Return type: matplotlib.figure.Figure or None

class lwdid.sensitivity.SpecificationResult(specification_id, n_pre_periods, start_period, end_period, excluded_periods, att, se, t_stat, pvalue, ci_lower, ci_upper, n_treated, n_control, df, converged=True, spec_warnings=<factory>)

Bases: object

Result from a single specification in sensitivity analysis.

Represents one point in the sensitivity analysis, corresponding to a specific configuration of pre-treatment periods.

Variables:

specification_id (int) – Unique identifier for this specification.
n_pre_periods (int) – Number of pre-treatment periods used.
start_period (int) – First pre-treatment period included.
end_period (int) – Last pre-treatment period included.
excluded_periods (int) – Number of periods excluded before treatment.
att (float) – Average treatment effect on the treated.
se (float) – Standard error of ATT.
t_stat (float) – t-statistic for H0: ATT=0.
pvalue (float) – Two-sided p-value.
ci_lower (float) – Lower bound of confidence interval.
ci_upper (float) – Upper bound of confidence interval.
n_treated (int) – Number of treated units.
n_control (int) – Number of control units.
df (int) – Degrees of freedom for inference.
converged (bool) – Whether estimation converged successfully.
spec_warnings (list[str]) – Warning messages from estimation.

specification_id: int

n_pre_periods: int

start_period: int

end_period: int

excluded_periods: int

att: float

se: float

t_stat: float

pvalue: float

ci_lower: float

ci_upper: float

n_treated: int

n_control: int

df: int

converged: bool = True

spec_warnings: list[str]

property is_significant_05: bool: Whether estimate is significant at 5% level.

property is_significant_10: bool: Whether estimate is significant at 10% level.

to_dict()

Convert specification result to dictionary for DataFrame construction.

Returns: Dictionary containing all specification attributes suitable for constructing a pandas DataFrame row.
Return type: dict

class lwdid.sensitivity.AnticipationEstimate(excluded_periods, att, se, t_stat, pvalue, ci_lower, ci_upper, n_pre_periods_used)

Bases: object

ATT estimate with specific anticipation exclusion.

Variables:

excluded_periods (int) – Number of periods excluded before treatment.
att (float) – Average treatment effect on the treated.
se (float) – Standard error of ATT.
t_stat (float) – t-statistic for H0: ATT=0.
pvalue (float) – Two-sided p-value.
ci_lower (float) – Lower bound of confidence interval.
ci_upper (float) – Upper bound of confidence interval.
n_pre_periods_used (int) – Number of pre-treatment periods actually used.

excluded_periods: int

att: float

se: float

t_stat: float

pvalue: float

ci_lower: float

ci_upper: float

n_pre_periods_used: int

property is_significant: bool: Whether estimate is significant at 5% level.

to_dict()

Convert anticipation estimate to dictionary.

Returns: Dictionary containing all estimate attributes suitable for constructing a pandas DataFrame row.
Return type: dict

Enumerations

class lwdid.sensitivity.RobustnessLevel(value)

Bases: Enum

Categorical assessment of estimate stability across specifications.

The robustness level is determined by the sensitivity ratio, which measures the range of ATT estimates relative to the baseline estimate magnitude.

Variables:

HIGHLY_ROBUST (str) – Sensitivity ratio below 10%. Estimates are very stable.
MODERATELY_ROBUST (str) – Sensitivity ratio between 10% and 25%. Estimates show minor variation.
SENSITIVE (str) – Sensitivity ratio between 25% and 50%. Estimates vary noticeably.
HIGHLY_SENSITIVE (str) – Sensitivity ratio at or above 50%. Estimates are unstable.

HIGHLY_ROBUST = 'highly_robust'

MODERATELY_ROBUST = 'moderately_robust'

SENSITIVE = 'sensitive'

HIGHLY_SENSITIVE = 'highly_sensitive'

class lwdid.sensitivity.AnticipationDetectionMethod(value)

Bases: Enum

Detection method used to identify potential anticipation effects.

Anticipation effects occur when units adjust behavior before formal treatment begins, violating the no-anticipation assumption.

Variables:

TREND_BREAK (str) – Detected via structural break in pre-treatment trend.
COEFFICIENT_CHANGE (str) – Detected via significant change in ATT when excluding periods.
PLACEBO_TEST (str) – Detected via significant placebo effects in pre-treatment periods.
NONE_DETECTED (str) – No anticipation effects identified by any method.
INSUFFICIENT_DATA (str) – Insufficient pre-treatment periods to perform detection.

TREND_BREAK = 'trend_break'

COEFFICIENT_CHANGE = 'coefficient_change'

PLACEBO_TEST = 'placebo_test'

NONE_DETECTED = 'none_detected'

INSUFFICIENT_DATA = 'insufficient_data'

Usage Examples

Pre-treatment Period Robustness

Assess how estimates change with different numbers of pre-treatment periods:

from lwdid import robustness_pre_periods

result = robustness_pre_periods(
    data,
    y='outcome',
    ivar='unit',
    tvar='year',
    gvar='first_treat',
    rolling='detrend',
    pre_period_range=(3, 8),
    verbose=True
)

# View summary
print(result.summary())

# Visualize results
fig = result.plot()

# Access detailed results
df = result.to_dataframe()
print(f"Sensitivity ratio: {result.sensitivity_ratio:.1%}")
print(f"Robustness level: {result.robustness_level.value}")

No-Anticipation Sensitivity

Test sensitivity to potential anticipation effects:

from lwdid import sensitivity_no_anticipation

result = sensitivity_no_anticipation(
    data,
    y='outcome',
    ivar='unit',
    tvar='year',
    gvar='first_treat',
    max_anticipation=3,
    detection_threshold=0.10
)

if result.anticipation_detected:
    print("Anticipation detected!")
    print(f"Recommended exclusion: {result.recommended_exclusion} periods")
else:
    print("No significant anticipation effects detected")
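One way to read the COEFFICIENT_CHANGE detection method is as a check on how far the ATT moves once periods before treatment are excluded. The rule below is a hypothetical illustration only: the relative-change criterion, the interpretation of detection_threshold as a fraction of the baseline ATT, and the helper name `detect_coefficient_change` are all assumptions and may not match what the package does.

```python
def detect_coefficient_change(att_by_exclusion, detection_threshold=0.10):
    """Hypothetical rule: flag anticipation when excluding k periods shifts
    the ATT by more than `detection_threshold` relative to the baseline.

    att_by_exclusion maps excluded_periods -> ATT and must contain key 0
    (the baseline with no exclusion).
    Returns (anticipation_detected, recommended_exclusion).
    """
    baseline = att_by_exclusion[0]
    if baseline == 0:
        raise ValueError("relative change is undefined for a zero baseline")

    for k in sorted(k for k in att_by_exclusion if k > 0):
        relative_change = abs(att_by_exclusion[k] - baseline) / abs(baseline)
        if relative_change > detection_threshold:
            # Assumption: recommend the smallest exclusion that moves the ATT.
            return True, k
    return False, 0


detected, recommended = detect_coefficient_change(
    {0: 1.00, 1: 1.04, 2: 1.30, 3: 1.32}
)
print(detected, recommended)
```

Here excluding one period changes the estimate by 4%, which stays under the threshold, while excluding two changes it by 30%, so the sketch flags anticipation and suggests excluding two periods.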

Comprehensive Analysis

Run multiple sensitivity analyses together:

from lwdid import sensitivity_analysis

result = sensitivity_analysis(
    data,
    y='outcome',
    ivar='unit',
    tvar='year',
    gvar='first_treat',
    analyses=['pre_periods', 'anticipation'],
    verbose=True
)

# View comprehensive summary
print(result.summary())

# Plot all analyses
result.plot_all()

Using exclude_pre_periods

Apply sensitivity analysis findings to main estimation:

from lwdid import lwdid

# Based on sensitivity analysis, exclude 2 periods before treatment
result = lwdid(
    data,
    y='outcome',
    d='d',
    ivar='unit',
    tvar='year',
    post='post',
    rolling='demean',
    exclude_pre_periods=2  # Exclude periods immediately before treatment
)

print(result.summary())

Interpreting Results

Sensitivity Ratio Thresholds

The sensitivity ratio measures estimate stability across specifications:

< 10%: Highly robust - estimates are stable
10-25%: Moderately robust - some sensitivity, generally acceptable
25-50%: Sensitive - interpret with caution
≥ 50%: Highly sensitive - results depend heavily on specification

Recommendations

The analysis functions provide automated recommendations based on:

Sensitivity ratio: How much estimates vary
Sign consistency: Whether all estimates have the same sign
Significance consistency: Whether all estimates are statistically significant
Pattern detection: Whether estimates show systematic trends

When estimates are sensitive, consider:

Using rolling='detrend' if heterogeneous trends may be present
Excluding periods with potential anticipation effects
Reporting the range of estimates for transparency
Investigating data quality in specific periods
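The first suggestion above is easier to judge with the two transformations side by side. The sketch assumes that rolling='demean' subtracts a unit's pre-treatment mean from its post-treatment outcomes and that rolling='detrend' subtracts a unit-specific linear trend fitted on the pre-treatment periods; the function names and the single-unit toy data are hypothetical, and the package's implementation may differ in detail.

```python
def demean(pre, post):
    """Subtract the pre-treatment mean from each post-treatment outcome.

    pre and post are lists of (period, outcome) pairs for one unit.
    """
    pre_mean = sum(y for _, y in pre) / len(pre)
    return [(t, y - pre_mean) for t, y in post]


def detrend(pre, post):
    """Subtract a linear trend fitted by OLS on the pre-treatment periods."""
    n = len(pre)
    t_bar = sum(t for t, _ in pre) / n
    y_bar = sum(y for _, y in pre) / n
    sxx = sum((t - t_bar) ** 2 for t, _ in pre)
    if sxx == 0:
        raise ValueError("need at least two distinct pre-treatment periods")
    slope = sum((t - t_bar) * (y - y_bar) for t, y in pre) / sxx
    intercept = y_bar - slope * t_bar
    return [(t, y - (intercept + slope * t)) for t, y in post]


# Toy unit: outcome follows 2 + 0.5 * t, plus an effect of 1.0 after treatment.
pre = [(1, 2.5), (2, 3.0), (3, 3.5), (4, 4.0)]
post = [(5, 5.5), (6, 6.0)]

demeaned = demean(pre, post)
detrended = detrend(pre, post)
print(demeaned)
print(detrended)
```

For this single unit, demeaning leaves the trend in the transformed outcome while detrending recovers the effect of 1.0. In the estimator the transformed outcomes are compared between treated and control units, so the distinction matters mainly when trends differ across units.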
