Sensitivity Analysis

This module provides tools for assessing how robust difference-in-differences estimates are to specification choices, implementing the recommendations of Lee and Wooldridge (2025) and Lee and Wooldridge (2026).

Main Functions

lwdid.sensitivity.robustness_pre_periods(data, y, ivar, tvar, gvar=None, d=None, post=None, rolling='demean', estimator='ra', controls=None, vce=None, cluster_var=None, pre_period_range=None, step=1, exclude_periods_before_treatment=0, robustness_threshold=0.25, alpha=0.05, verbose=True)[source]

Assess robustness of ATT estimates to pre-treatment period selection.

Tests how ATT estimates vary when using different numbers of pre-treatment periods, allowing researchers to assess whether findings are robust to this methodological choice.

Parameters:
  • data (pd.DataFrame) – Panel data in long format.

  • y (str) – Outcome variable column name.

  • ivar (str) – Unit identifier column name.

  • tvar (str) – Time variable column name.

  • gvar (str, optional) – Cohort variable for staggered designs.

  • d (str, optional) – Treatment indicator for common timing.

  • post (str, optional) – Post-treatment indicator for common timing.

  • rolling ({'demean', 'detrend'}, default 'demean') – Transformation method.

  • estimator ({'ra', 'ipw', 'ipwra', 'psm'}, default 'ra') – Estimation method.

  • controls (list of str, optional) – Control variable column names.

  • vce (str, optional) – Variance estimator type.

  • cluster_var (str, optional) – Cluster variable for clustered SE.

  • pre_period_range (tuple of (int, int), optional) – Range of pre-treatment periods to test (min_periods, max_periods). If None, automatically determined from data.

  • step (int, default 1) – Step size for varying pre-treatment periods.

  • exclude_periods_before_treatment (int, default 0) – Number of periods to exclude immediately before treatment. Useful for testing robustness to no-anticipation violations.

  • robustness_threshold (float, default 0.25) – Threshold for robustness determination. Results are considered robust if sensitivity_ratio < robustness_threshold.

  • alpha (float, default 0.05) – Significance level for confidence intervals.

  • verbose (bool, default True) – Whether to print progress and summary.

Returns:

Results containing:

  • specifications: ATT estimates for each pre-period count

  • sensitivity_ratio: Range of ATT estimates relative to baseline

  • is_robust: Whether estimates are stable across specifications

  • recommendation: Interpretation and recommendations

  • figure: Sensitivity plot (if plot() called)

Return type:

PrePeriodRobustnessResult

Notes

The function varies the starting point of the pre-treatment window and re-estimates the ATT for each resulting specification.
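
To make the windowing idea concrete, the sketch below enumerates candidate pre-treatment windows for a single cohort. It is not the library's internal code: the helper name candidate_windows and its arguments are hypothetical, and it assumes integer-coded periods, min_pre of at least 1, and windows that always end in the period just before treatment.

```python
def candidate_windows(periods, g, min_pre, max_pre, step=1):
    """Enumerate pre-treatment windows for a cohort first treated at g.

    Hypothetical helper for illustration only. Each window keeps the n
    most recent pre-treatment periods, for n = min_pre, min_pre + step,
    ..., up to max_pre. Assumes min_pre >= 1.
    """
    # Pre-treatment periods are those strictly before g.
    pre = sorted(t for t in periods if t < g)
    windows = {}
    for n in range(min_pre, max_pre + 1, step):
        if n > len(pre):
            break  # not enough pre-treatment data for a window this long
        windows[n] = pre[-n:]
    return windows


windows = candidate_windows(range(2000, 2011), g=2008, min_pre=3, max_pre=8)
for n, window in windows.items():
    print(f"{n} pre-periods: {window[0]}-{window[-1]}")
```

Each window would then be passed to the estimator in turn, so that the ATT can be compared across window lengths.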

In many applications, the policy intervention may be based on past outcomes. This analysis helps determine whether sufficient pre-treatment periods are being used to adequately control for selection into treatment.

Robustness levels based on sensitivity ratio:

  • < 10%: Highly robust

  • 10-25%: Moderately robust

  • 25-50%: Sensitive

  • >= 50%: Highly sensitive
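
The sensitivity ratio behind these levels is documented on PrePeriodRobustnessResult as (max - min) / abs(baseline). A minimal sketch of that arithmetic follows. The function name is hypothetical, the numbers are made up for illustration, and raising an error for a zero baseline is an assumption rather than documented library behaviour.

```python
def sensitivity_ratio(atts, baseline):
    """Range of ATT estimates relative to the baseline magnitude."""
    if baseline == 0:
        # Assumption: the ratio is treated as undefined here; how the
        # library handles a zero baseline is not documented.
        raise ValueError("baseline ATT is zero; ratio is undefined")
    return (max(atts) - min(atts)) / abs(baseline)


# Illustrative numbers only, not output from a real analysis.
atts = [0.50, 0.46, 0.54, 0.52]
ratio = sensitivity_ratio(atts, baseline=0.50)
is_robust = ratio < 0.25  # 0.25 is the default robustness_threshold
print(f"Sensitivity ratio: {ratio:.1%}, robust: {is_robust}")
```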

See also

lwdid

Main estimation function.

sensitivity_no_anticipation

Test robustness to anticipation effects.

sensitivity_analysis

Comprehensive sensitivity analysis.

lwdid.sensitivity.sensitivity_no_anticipation(data, y, ivar, tvar, gvar=None, d=None, post=None, rolling='demean', estimator='ra', controls=None, vce=None, cluster_var=None, max_anticipation=3, detection_threshold=0.1, alpha=0.05, verbose=True)[source]

Test robustness of ATT estimates to potential anticipation effects.

When the no-anticipation assumption may be violated (e.g., policy announced before implementation), units may adjust behavior before formal treatment. This function tests robustness by excluding periods immediately before treatment from the pre-treatment baseline.

Parameters:
  • data (pd.DataFrame) – Panel data in long format.

  • y (str) – Outcome variable column name.

  • ivar (str) – Unit identifier column name.

  • tvar (str) – Time variable column name.

  • gvar (str, optional) – Cohort variable for staggered designs.

  • d (str, optional) – Treatment indicator for common timing.

  • post (str, optional) – Post-treatment indicator for common timing.

  • rolling ({'demean', 'detrend'}, default 'demean') – Transformation method.

  • estimator ({'ra', 'ipw', 'ipwra', 'psm'}, default 'ra') – Estimation method.

  • controls (list of str, optional) – Control variable column names.

  • vce (str, optional) – Variance estimator type.

  • cluster_var (str, optional) – Cluster variable for clustered SE.

  • max_anticipation (int, default 3) – Maximum number of periods to test for anticipation effects. Tests excluding 0, 1, 2, …, max_anticipation periods.

  • detection_threshold (float, default 0.10) – Threshold for detecting anticipation effects. If relative change in ATT exceeds this threshold, anticipation is detected.

  • alpha (float, default 0.05) – Significance level for confidence intervals.

  • verbose (bool, default True) – Whether to print progress and summary.

Returns:

Results containing:

  • estimates: ATT estimates for each exclusion count

  • anticipation_detected: Whether anticipation effects are detected

  • recommended_exclusion: Recommended number of periods to exclude

  • figure: Sensitivity plot (if plot() called)

Return type:

NoAnticipationSensitivityResult

Notes

The no-anticipation assumption requires that, prior to the first intervention period for a given treatment cohort, the potential outcomes are the same (on average) as in the never-treated state.

If policy is announced k periods before implementation, units may adjust behavior during periods {g-k, …, g-1}. By excluding these periods from the pre-treatment baseline, we can test whether estimates are robust to such anticipation effects.
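
The exclusion rule can be sketched in a few lines. This is an illustration under stated assumptions (integer-coded periods, a single cohort first treated at g), and the helper name is hypothetical rather than part of lwdid.

```python
def pre_periods_after_exclusion(periods, g, k):
    """Pre-treatment periods left after dropping {g-k, ..., g-1}."""
    return [t for t in sorted(periods) if t < g - k]


periods = range(2010, 2021)
for k in range(0, 4):  # mirrors max_anticipation=3: exclude 0, 1, 2, 3 periods
    kept = pre_periods_after_exclusion(periods, g=2017, k=k)
    print(f"k={k}: baseline uses {kept[0]}-{kept[-1]} ({len(kept)} periods)")
```

Each additional excluded period shortens the baseline, so the number of pre-treatment periods actually used (reported as n_pre_periods_used on each estimate) shrinks as k grows.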

See also

robustness_pre_periods

General pre-period robustness check.

sensitivity_analysis

Comprehensive sensitivity analysis.

lwdid.sensitivity.sensitivity_analysis(data, y, ivar, tvar, gvar=None, d=None, post=None, rolling='demean', estimator='ra', controls=None, vce=None, cluster_var=None, analyses=None, alpha=0.05, verbose=True)[source]

Perform comprehensive sensitivity analysis for DiD estimation.

Combines multiple robustness checks into a single analysis, providing an overall assessment of estimate reliability across different methodological choices.

Parameters:
  • data (pd.DataFrame) – Panel data in long format.

  • y (str) – Outcome variable column name.

  • ivar (str) – Unit identifier column name.

  • tvar (str) – Time variable column name.

  • gvar (str, optional) – Cohort variable for staggered designs.

  • d (str, optional) – Treatment indicator for common timing.

  • post (str, optional) – Post-treatment indicator for common timing.

  • rolling ({'demean', 'detrend'}, default 'demean') – Primary transformation method.

  • estimator ({'ra', 'ipw', 'ipwra', 'psm'}, default 'ra') – Primary estimation method.

  • controls (list of str, optional) – Control variable column names.

  • vce (str, optional) – Variance estimator type.

  • cluster_var (str, optional) – Cluster variable for clustered SE.

  • analyses (list of str, optional) – Which analyses to run. Default: all. Options: 'pre_periods', 'anticipation', 'transformation', 'estimator'.

  • alpha (float, default 0.05) – Significance level.

  • verbose (bool, default True) – Whether to print progress and summary.

Returns:

Combined results from all sensitivity analyses.

Return type:

ComprehensiveSensitivityResult

Notes

Four types of sensitivity analysis are available:

  1. Pre-periods: Tests stability across different numbers of pre-treatment periods used in the transformation.

  2. Anticipation: Tests robustness to potential anticipation effects by excluding periods immediately before treatment.

  3. Transformation: Compares demean and detrend methods to assess whether heterogeneous trends may be present.

  4. Estimator: Compares RA, IPW, and IPWRA estimators to check robustness to propensity score or outcome model misspecification.

See also

robustness_pre_periods

Pre-period robustness check.

sensitivity_no_anticipation

Anticipation sensitivity check.

lwdid.sensitivity.plot_sensitivity(result, show_ci=True, show_baseline=True, highlight_significant=True, figsize=(10, 6), ax=None)[source]

Visualize sensitivity analysis results.

Creates a plot showing how ATT estimates vary across different specifications, with confidence intervals and significance indicators.

Parameters:
  • result (PrePeriodRobustnessResult or NoAnticipationSensitivityResult) – Result object from sensitivity analysis.

  • show_ci (bool, default True) – Whether to show confidence intervals.

  • show_baseline (bool, default True) – Whether to show baseline reference line.

  • highlight_significant (bool, default True) – Whether to highlight significant estimates.

  • figsize (tuple, default (10, 6)) – Figure size in inches.

  • ax (matplotlib.axes.Axes, optional) – Axes to plot on.

Returns:

The generated figure.

Return type:

matplotlib.figure.Figure

Result Classes

class lwdid.sensitivity.PrePeriodRobustnessResult(specifications, baseline_spec, att_range, att_mean, att_std, sensitivity_ratio, robustness_level, is_robust, robustness_threshold, all_same_sign, all_significant, n_significant, n_sign_changes, rolling_method, estimator, n_specifications, pre_period_range_tested, recommendation, detailed_recommendations=<factory>, result_warnings=<factory>, figure=None)[source]

Bases: object

Result of pre-treatment period robustness analysis.

Assesses how ATT estimates vary when using different numbers of pre-treatment periods, helping identify whether findings are robust to this methodological choice.

Variables:
  • specifications (list[SpecificationResult]) – ATT estimates for each pre-period configuration.

  • baseline_spec (SpecificationResult) – Estimate using all available pre-treatment periods.

  • att_range (tuple[float, float]) – (min ATT, max ATT) across all specifications.

  • att_mean (float) – Mean ATT across specifications.

  • att_std (float) – Standard deviation of ATT across specifications.

  • sensitivity_ratio (float) – Ratio of range to baseline: (max - min) / abs(baseline).

  • robustness_level (RobustnessLevel) – Categorical assessment of robustness.

  • is_robust (bool) – Whether estimates are stable (ratio < threshold).

  • robustness_threshold (float) – Threshold used for robustness determination.

  • all_same_sign (bool) – Whether all estimates have the same sign.

  • all_significant (bool) – Whether all estimates are significant at 5%.

  • n_significant (int) – Number of significant specifications.

  • n_sign_changes (int) – Number of specifications with sign different from baseline.

  • rolling_method (str) – Transformation method used.

  • estimator (str) – Estimation method used.

  • n_specifications (int) – Total number of specifications tested.

  • pre_period_range_tested (tuple[int, int]) – Range of pre-periods tested (min, max).

  • recommendation (str) – Main recommendation based on analysis.

  • detailed_recommendations (list[str]) – Detailed recommendations.

  • result_warnings (list[str]) – Warning messages.

  • figure (Any | None) – Matplotlib figure if plot was generated.

specifications: list[SpecificationResult]
baseline_spec: SpecificationResult
att_range: tuple[float, float]
att_mean: float
att_std: float
sensitivity_ratio: float
robustness_level: RobustnessLevel
is_robust: bool
robustness_threshold: float
all_same_sign: bool
all_significant: bool
n_significant: int
n_sign_changes: int
rolling_method: str
estimator: str
n_specifications: int
pre_period_range_tested: tuple[int, int]
recommendation: str
detailed_recommendations: list[str]
result_warnings: list[str]
figure: Any | None = None

to_dataframe()[source]

Convert all specification results to a pandas DataFrame.

Returns:

DataFrame with one row per specification containing ATT estimates, standard errors, p-values, and other diagnostic information.

Return type:

pd.DataFrame

get_specification(n_pre)[source]

Retrieve specification result for a specific pre-period count.

Parameters:

n_pre (int) – Number of pre-treatment periods to look up.

Returns:

The specification result if found, None otherwise.

Return type:

SpecificationResult or None

summary()[source]

Generate a comprehensive human-readable summary report.

Returns:

Formatted text report containing configuration, baseline estimates, sensitivity metrics, robustness assessment, and recommendations.

Return type:

str

plot(show_ci=True, show_baseline=True, figsize=(10, 6), ax=None)[source]

Generate sensitivity plot.

Shows ATT estimates across different pre-period specifications with confidence intervals.

Parameters:
  • show_ci (bool, default True) – Whether to show confidence intervals.

  • show_baseline (bool, default True) – Whether to show baseline reference line.

  • figsize (tuple, default (10, 6)) – Figure size in inches.

  • ax (matplotlib.axes.Axes, optional) – Axes to plot on. If None, creates new figure.

Returns:

The generated figure.

Return type:

matplotlib.figure.Figure

class lwdid.sensitivity.NoAnticipationSensitivityResult(estimates, baseline_estimate, anticipation_detected, recommended_exclusion, detection_method, recommendation, result_warnings=<factory>, figure=None)[source]

Bases: object

Result of no-anticipation sensitivity analysis.

Tests robustness of ATT estimates to potential anticipation effects by excluding periods immediately before treatment.

Variables:
  • estimates (list[AnticipationEstimate]) – ATT estimates for each exclusion configuration.

  • baseline_estimate (AnticipationEstimate) – Estimate with no exclusion (excluded_periods=0).

  • anticipation_detected (bool) – Whether anticipation effects are detected.

  • recommended_exclusion (int) – Recommended number of periods to exclude.

  • detection_method (AnticipationDetectionMethod) – Method used to detect anticipation.

  • recommendation (str) – Interpretation and recommendations.

  • result_warnings (list[str]) – Warning messages.

  • figure (Any | None) – Matplotlib figure if plot was generated.

estimates: list[AnticipationEstimate]
baseline_estimate: AnticipationEstimate
anticipation_detected: bool
recommended_exclusion: int
detection_method: AnticipationDetectionMethod
recommendation: str
result_warnings: list[str]
figure: Any | None = None

to_dataframe()[source]

Convert all anticipation estimates to a pandas DataFrame.

Returns:

DataFrame with one row per exclusion level containing ATT estimates, standard errors, p-values, and significance indicators.

Return type:

pd.DataFrame

summary()[source]

Generate a human-readable summary of the anticipation analysis.

Returns:

Formatted text report containing estimates by exclusion level, detection results, and recommendations.

Return type:

str

plot(show_ci=True, figsize=(10, 6), ax=None)[source]

Generate anticipation sensitivity plot.

Parameters:
  • show_ci (bool, default True) – Whether to show confidence intervals.

  • figsize (tuple, default (10, 6)) – Figure size in inches.

  • ax (matplotlib.axes.Axes, optional) – Axes to plot on.

Returns:

The generated figure.

Return type:

matplotlib.figure.Figure

class lwdid.sensitivity.ComprehensiveSensitivityResult(pre_period_result=None, anticipation_result=None, transformation_comparison=None, estimator_comparison=None, overall_assessment='', recommendations=<factory>)[source]

Bases: object

Combined results from comprehensive sensitivity analysis.

Variables:
  • pre_period_result (PrePeriodRobustnessResult | None) – Results from pre-period robustness analysis.

  • anticipation_result (NoAnticipationSensitivityResult | None) – Results from no-anticipation sensitivity analysis.

  • transformation_comparison (dict | None) – Comparison of demean vs detrend results.

  • estimator_comparison (dict | None) – Comparison across different estimators.

  • overall_assessment (str) – Overall robustness assessment.

  • recommendations (list[str]) – List of recommendations.

pre_period_result: PrePeriodRobustnessResult | None = None
anticipation_result: NoAnticipationSensitivityResult | None = None
transformation_comparison: dict | None = None
estimator_comparison: dict | None = None
overall_assessment: str = ''
recommendations: list[str]

summary()[source]

Generate a comprehensive summary of all sensitivity analyses.

Returns:

Formatted text report containing results from pre-period robustness, anticipation sensitivity, transformation comparison, estimator comparison, overall assessment, and recommendations.

Return type:

str

plot_all(figsize=(14, 10))[source]

Generate combined visualization of all sensitivity analyses.

Parameters:

figsize (tuple of float, default (14, 10)) – Figure size in inches (width, height).

Returns:

Combined figure with subplots for each available analysis, or None if no results are available to plot.

Return type:

matplotlib.figure.Figure or None

class lwdid.sensitivity.SpecificationResult(specification_id, n_pre_periods, start_period, end_period, excluded_periods, att, se, t_stat, pvalue, ci_lower, ci_upper, n_treated, n_control, df, converged=True, spec_warnings=<factory>)[source]

Bases: object

Result from a single specification in sensitivity analysis.

Represents one point in the sensitivity analysis, corresponding to a specific configuration of pre-treatment periods.

Variables:
  • specification_id (int) – Unique identifier for this specification.

  • n_pre_periods (int) – Number of pre-treatment periods used.

  • start_period (int) – First pre-treatment period included.

  • end_period (int) – Last pre-treatment period included.

  • excluded_periods (int) – Number of periods excluded before treatment.

  • att (float) – Average treatment effect on the treated.

  • se (float) – Standard error of ATT.

  • t_stat (float) – t-statistic for H0: ATT=0.

  • pvalue (float) – Two-sided p-value.

  • ci_lower (float) – Lower bound of confidence interval.

  • ci_upper (float) – Upper bound of confidence interval.

  • n_treated (int) – Number of treated units.

  • n_control (int) – Number of control units.

  • df (int) – Degrees of freedom for inference.

  • converged (bool) – Whether estimation converged successfully.

  • spec_warnings (list[str]) – Warning messages from estimation.
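
The inferential fields relate to one another in the usual way. The sketch below reproduces that relationship with a normal approximation from the standard library. This is a simplifying assumption: the presence of a df field suggests the library bases inference on a t distribution, so values for small samples would differ. The function name is hypothetical and the numbers are illustrative.

```python
from statistics import NormalDist


def wald_summary(att, se, alpha=0.05):
    """t-statistic, two-sided p-value and CI under a normal approximation."""
    std_normal = NormalDist()
    t_stat = att / se
    pvalue = 2 * (1 - std_normal.cdf(abs(t_stat)))
    crit = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha=0.05
    return {
        "t_stat": t_stat,
        "pvalue": pvalue,
        "ci_lower": att - crit * se,
        "ci_upper": att + crit * se,
    }


# Illustrative numbers only.
summary = wald_summary(att=0.50, se=0.20)
print({name: round(value, 3) for name, value in summary.items()})
```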

specification_id: int
n_pre_periods: int
start_period: int
end_period: int
excluded_periods: int
att: float
se: float
t_stat: float
pvalue: float
ci_lower: float
ci_upper: float
n_treated: int
n_control: int
df: int
converged: bool = True
spec_warnings: list[str]

property is_significant_05: bool

Whether estimate is significant at 5% level.

property is_significant_10: bool

Whether estimate is significant at 10% level.

to_dict()[source]

Convert specification result to dictionary for DataFrame construction.

Returns:

Dictionary containing all specification attributes suitable for constructing a pandas DataFrame row.

Return type:

dict

class lwdid.sensitivity.AnticipationEstimate(excluded_periods, att, se, t_stat, pvalue, ci_lower, ci_upper, n_pre_periods_used)[source]

Bases: object

ATT estimate with specific anticipation exclusion.

Variables:
  • excluded_periods (int) – Number of periods excluded before treatment.

  • att (float) – Average treatment effect on the treated.

  • se (float) – Standard error of ATT.

  • t_stat (float) – t-statistic for H0: ATT=0.

  • pvalue (float) – Two-sided p-value.

  • ci_lower (float) – Lower bound of confidence interval.

  • ci_upper (float) – Upper bound of confidence interval.

  • n_pre_periods_used (int) – Number of pre-treatment periods actually used.

excluded_periods: int
att: float
se: float
t_stat: float
pvalue: float
ci_lower: float
ci_upper: float
n_pre_periods_used: int

property is_significant: bool

Whether estimate is significant at 5% level.

to_dict()[source]

Convert anticipation estimate to dictionary.

Returns:

Dictionary containing all estimate attributes suitable for constructing a pandas DataFrame row.

Return type:

dict

Enumerations

class lwdid.sensitivity.RobustnessLevel(value)[source]

Bases: Enum

Categorical assessment of estimate stability across specifications.

The robustness level is determined by the sensitivity ratio, which measures the range of ATT estimates relative to the baseline estimate magnitude.

Variables:
  • HIGHLY_ROBUST (str) – Sensitivity ratio below 10%. Estimates are very stable.

  • MODERATELY_ROBUST (str) – Sensitivity ratio between 10% and 25%. Estimates show minor variation.

  • SENSITIVE (str) – Sensitivity ratio between 25% and 50%. Estimates vary noticeably.

  • HIGHLY_SENSITIVE (str) – Sensitivity ratio at or above 50%. Estimates are unstable.

HIGHLY_ROBUST = 'highly_robust'
MODERATELY_ROBUST = 'moderately_robust'
SENSITIVE = 'sensitive'
HIGHLY_SENSITIVE = 'highly_sensitive'

class lwdid.sensitivity.AnticipationDetectionMethod(value)[source]

Bases: Enum

Detection method used to identify potential anticipation effects.

Anticipation effects occur when units adjust behavior before formal treatment begins, violating the no-anticipation assumption.

Variables:
  • TREND_BREAK (str) – Detected via structural break in pre-treatment trend.

  • COEFFICIENT_CHANGE (str) – Detected via significant change in ATT when excluding periods.

  • PLACEBO_TEST (str) – Detected via significant placebo effects in pre-treatment periods.

  • NONE_DETECTED (str) – No anticipation effects identified by any method.

  • INSUFFICIENT_DATA (str) – Insufficient pre-treatment periods to perform detection.

TREND_BREAK = 'trend_break'
COEFFICIENT_CHANGE = 'coefficient_change'
PLACEBO_TEST = 'placebo_test'
NONE_DETECTED = 'none_detected'
INSUFFICIENT_DATA = 'insufficient_data'

Usage Examples

Pre-treatment Period Robustness

Assess how estimates change with different numbers of pre-treatment periods:

from lwdid import robustness_pre_periods

result = robustness_pre_periods(
    data,
    y='outcome',
    ivar='unit',
    tvar='year',
    gvar='first_treat',
    rolling='detrend',
    pre_period_range=(3, 8),
    verbose=True
)

# View summary
print(result.summary())

# Visualize results
fig = result.plot()

# Access detailed results
df = result.to_dataframe()
print(f"Sensitivity ratio: {result.sensitivity_ratio:.1%}")
print(f"Robustness level: {result.robustness_level.value}")

No-Anticipation Sensitivity

Test sensitivity to potential anticipation effects:

from lwdid import sensitivity_no_anticipation

result = sensitivity_no_anticipation(
    data,
    y='outcome',
    ivar='unit',
    tvar='year',
    gvar='first_treat',
    max_anticipation=3,
    detection_threshold=0.10
)

if result.anticipation_detected:
    print("Anticipation detected!")
    print(f"Recommended exclusion: {result.recommended_exclusion} periods")
else:
    print("No significant anticipation effects detected")
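
The role of detection_threshold can be illustrated with a small standalone sketch. It compares each exclusion level's ATT with the no-exclusion baseline and flags relative changes above the threshold. The helper name and the numbers are hypothetical, and the sketch does not attempt to reproduce how the library chooses recommended_exclusion.

```python
def relative_changes(atts_by_exclusion):
    """Relative change in ATT versus the no-exclusion baseline (key 0)."""
    baseline = atts_by_exclusion[0]
    if baseline == 0:
        # Assumption: treated as undefined in this sketch.
        raise ValueError("baseline ATT is zero; relative change is undefined")
    return {
        k: abs(att - baseline) / abs(baseline)
        for k, att in atts_by_exclusion.items()
        if k > 0
    }


# Illustrative ATT estimates keyed by number of excluded periods.
atts = {0: 0.40, 1: 0.41, 2: 0.47, 3: 0.48}
changes = relative_changes(atts)
flagged = [k for k, change in sorted(changes.items()) if change > 0.10]
detected = bool(flagged)
print(f"Flagged exclusion levels: {flagged}, detected: {detected}")
```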

Comprehensive Analysis

Run multiple sensitivity analyses together:

from lwdid import sensitivity_analysis

result = sensitivity_analysis(
    data,
    y='outcome',
    ivar='unit',
    tvar='year',
    gvar='first_treat',
    analyses=['pre_periods', 'anticipation'],
    verbose=True
)

# View comprehensive summary
print(result.summary())

# Plot all analyses
result.plot_all()

Using exclude_pre_periods

Apply sensitivity analysis findings to main estimation:

from lwdid import lwdid

# Based on sensitivity analysis, exclude 2 periods before treatment
result = lwdid(
    data,
    y='outcome',
    d='d',
    ivar='unit',
    tvar='year',
    post='post',
    rolling='demean',
    exclude_pre_periods=2  # Exclude periods immediately before treatment
)

print(result.summary())

Interpreting Results

Sensitivity Ratio Thresholds

The sensitivity ratio measures estimate stability across specifications:

  • < 10%: Highly robust - estimates are stable

  • 10-25%: Moderately robust - some sensitivity, generally acceptable

  • 25-50%: Sensitive - interpret with caution

  • ≥ 50%: Highly sensitive - results depend heavily on specification
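
These bands can be written as a simple lookup. The sketch below returns the string values of the RobustnessLevel enumeration. The function name is hypothetical, and assigning a ratio that sits exactly on a boundary to the less robust band is an assumption, chosen to agree with the strict inequality in is_robust.

```python
def classify_robustness(ratio):
    """Map a sensitivity ratio (a fraction, e.g. 0.16) to a level name."""
    if ratio < 0.10:
        return "highly_robust"
    if ratio < 0.25:
        return "moderately_robust"
    if ratio < 0.50:
        return "sensitive"
    return "highly_sensitive"


for ratio in (0.04, 0.16, 0.25, 0.80):
    print(f"{ratio:.0%} -> {classify_robustness(ratio)}")
```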

Recommendations

The analysis functions provide automated recommendations based on:

  1. Sensitivity ratio: How much estimates vary

  2. Sign consistency: Whether all estimates have the same sign

  3. Significance consistency: Whether all estimates are statistically significant

  4. Pattern detection: Whether estimates show systematic trends
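
The sign and significance checks correspond to the all_same_sign, all_significant, n_significant and n_sign_changes fields of PrePeriodRobustnessResult. A rough sketch of how such summaries could be computed is shown below. The function is hypothetical, the inputs are made up, and counting an ATT of exactly zero as non-positive is an assumption.

```python
def consistency_checks(atts, pvalues, baseline_att, alpha=0.05):
    """Summarise sign and significance consistency across specifications."""
    positive = [att > 0 for att in atts]
    n_significant = sum(p < alpha for p in pvalues)
    return {
        "all_same_sign": all(positive) or not any(positive),
        "all_significant": n_significant == len(pvalues),
        "n_significant": n_significant,
        # Specifications whose sign differs from the baseline estimate.
        "n_sign_changes": sum(
            (att > 0) != (baseline_att > 0) for att in atts
        ),
    }


# Illustrative numbers only.
checks = consistency_checks(
    atts=[0.50, 0.46, -0.02, 0.52],
    pvalues=[0.01, 0.03, 0.90, 0.02],
    baseline_att=0.50,
)
print(checks)
```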

When estimates are sensitive, consider:

  • Using rolling='detrend' if heterogeneous trends may be present

  • Excluding periods with potential anticipation effects

  • Reporting the range of estimates for transparency

  • Investigating data quality in specific periods

See Also