Exceptions Module (exceptions)

The exceptions module defines custom exception classes for clear error reporting in the lwdid package.

Exception hierarchy for the lwdid package.

Provides a structured exception hierarchy for difference-in-differences estimation. All exceptions inherit from LWDIDError, enabling unified error handling across the package.

The hierarchy covers parameter validation, data insufficiency, time series requirements, randomization inference, visualization, and staggered designs.

exception lwdid.exceptions.LWDIDError[source]

Bases: Exception

Base exception class for all lwdid package errors.

All custom exceptions in the lwdid package inherit from this class, providing a common base for unified error handling.

exception lwdid.exceptions.InvalidParameterError[source]

Bases: LWDIDError

Exception raised when input parameter validation fails.

This is a general exception for invalid parameter values that do not fall into more specific categories. Common triggers include:

Invalid data types for control variables
Cluster variable specified without vce='cluster'
Other parameter constraint violations

See also

InvalidRollingMethodError: For invalid rolling method values.
InvalidVCETypeError: For invalid variance estimator types.

exception lwdid.exceptions.InvalidRollingMethodError[source]

Bases: InvalidParameterError

Exception raised when the rolling parameter has an invalid value.

The rolling parameter must be one of: 'demean', 'detrend', 'demeanq', or 'detrendq'. This exception is raised during input validation when an unsupported transformation method is specified.

See also

InvalidParameterError: Parent class for parameter validation errors.

exception lwdid.exceptions.InvalidVCETypeError[source]

Bases: InvalidParameterError

Exception raised when the vce parameter has an invalid value.

The vce (variance-covariance estimator) parameter must be one of: None, 'robust', 'hc0', 'hc1', 'hc2', 'hc3', 'hc4', or 'cluster'. This exception is raised during estimation when an unsupported variance estimator type is specified.

See also

InvalidParameterError: Parent class for parameter validation errors.

exception lwdid.exceptions.UnbalancedPanelError(message, min_obs, max_obs, n_incomplete_units)[source]

Bases: LWDIDError

Exception raised when a balanced panel is required but the data are unbalanced.

This exception is raised when balanced_panel='error' is specified and the panel data contains units with different numbers of observations.

Variables:

min_obs (int) – Minimum observations per unit in the panel.
max_obs (int) – Maximum observations per unit in the panel.
n_incomplete_units (int) – Number of units with fewer than max_obs observations.

Notes

Unbalanced panels arise when units have different numbers of observed time periods. Under standard selection assumptions, this is acceptable provided that missingness depends only on time-invariant unit heterogeneity and not on time-varying shocks. Users may want to enforce balanced panels for sensitivity analysis or when the selection mechanism is questionable.

See also

LWDIDError: Base exception class.
diagnose_selection_mechanism: Diagnostic tools for selection bias.
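
The three attributes can be reproduced from raw data as a quick pre-check. The following is a minimal sketch rather than the package's implementation: `panel_balance_summary` is a hypothetical helper, and the input is assumed to be a list of (unit, period) pairs with one entry per observation.

```python
from collections import Counter

def panel_balance_summary(observations):
    """Summarize observation counts per unit.

    `observations` is an iterable of (unit, period) pairs, one per row.
    Returns (min_obs, max_obs, n_incomplete_units), mirroring the attribute
    names documented for UnbalancedPanelError.
    """
    counts = Counter(unit for unit, _ in observations)
    if not counts:
        raise ValueError("no observations supplied")
    min_obs = min(counts.values())
    max_obs = max(counts.values())
    # Units observed in fewer periods than the best-covered unit.
    n_incomplete = sum(1 for n in counts.values() if n < max_obs)
    return min_obs, max_obs, n_incomplete

rows = [("a", 2000), ("a", 2001), ("a", 2002),
        ("b", 2000), ("b", 2001),
        ("c", 2000), ("c", 2001), ("c", 2002)]
summary = panel_balance_summary(rows)
is_balanced = summary[0] == summary[1]
```

In this toy panel, unit "b" is missing one period, so the panel is unbalanced and a call with balanced_panel='error' would be expected to fail.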

exception lwdid.exceptions.InsufficientDataError[source]

Bases: LWDIDError

Exception raised when sample size is insufficient for estimation.

This is a general exception for data insufficiency issues. More specific subclasses indicate the exact nature of the insufficiency (e.g., no treated units, no control units, insufficient pre-periods).

See also

NoTreatedUnitsError: No units with treatment indicator d=1.
NoControlUnitsError: No units with treatment indicator d=0.
InsufficientPrePeriodsError: Insufficient pre-treatment periods.

exception lwdid.exceptions.NoTreatedUnitsError[source]

Bases: InsufficientDataError

Exception raised when there are no treated units in the data.

Raised when all units have treatment indicator d=0 in the panel or regression sample. At least one treated unit (d=1) is required for difference-in-differences estimation.

exception lwdid.exceptions.NoControlUnitsError[source]

Bases: InsufficientDataError

Exception raised when there are no control units in the data.

Raised when all units have treatment indicator d=1 in the panel or regression sample. At least one control unit (d=0) is required for difference-in-differences estimation.
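
Both conditions can be checked before estimation. This is an illustrative sketch only: `check_treatment_groups` is a hypothetical helper, it assumes a mapping from unit id to a time-invariant 0/1 indicator, and it returns the name of the documented exception instead of raising it.

```python
def check_treatment_groups(treatment_by_unit):
    """Return the name of the exception documented above that would apply.

    `treatment_by_unit` maps each unit id to its indicator d
    (0 = control, 1 = treated). Returns None when both groups are present.
    """
    values = set(treatment_by_unit.values())
    if 1 not in values:
        return "NoTreatedUnitsError"
    if 0 not in values:
        return "NoControlUnitsError"
    return None

status = check_treatment_groups({"u1": 0, "u2": 0, "u3": 0})
```

Because the documentation mentions the regression sample as well as the panel, the same check is worth repeating after any filtering that drops units.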

exception lwdid.exceptions.InsufficientPrePeriodsError(message, cohort=None, available=None, required=None, excluded=None)[source]

Bases: InsufficientDataError

Exception raised when pre-treatment periods are insufficient.

Raised when the number of pre-treatment periods is too small for the chosen rolling transformation. Different methods have different minimum global and unit-level requirements:

demean: Each unit must have at least 1 pre-treatment observation.
detrend: (i) The panel must contain at least 2 pre-treatment periods in total (T0 >= 2); and (ii) Each unit must have at least 2 pre-treatment observations so that a unit-specific linear trend can be estimated.
demeanq: (i) The panel must contain at least 1 pre-treatment period in total (T0 >= 1); (ii) Each unit must have at least 1 pre-treatment observation; and (iii) For each unit, the number of pre-treatment observations must be at least the number of distinct pre-period quarters plus one, n_pre >= (#quarters_pre + 1), to ensure positive degrees of freedom when estimating quarterly fixed effects.
detrendq: (i) The panel must contain at least 2 pre-treatment periods in total (T0 >= 2); (ii) Each unit must have at least 2 pre-treatment observations; and (iii) For each unit, the number of pre-treatment observations must be at least 1 plus the number of distinct pre-period quarters, n_pre >= (1 + #quarters_pre), to avoid rank deficiency when estimating a linear trend with quarterly effects.
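
The unit-level requirements listed above can be written down as a small lookup. This is a sketch under stated assumptions, not the package's validation code: `required_pre_obs` and `units_failing` are hypothetical names, and the panel-level T0 conditions are not checked here.

```python
def required_pre_obs(method, n_quarters_pre=0):
    """Minimum pre-treatment observations per unit, per the list above.

    `n_quarters_pre` is the number of distinct quarters a unit is observed
    in before treatment; it is only used by the quarterly methods.
    """
    if method == "demean":
        return 1
    if method == "detrend":
        return 2
    if method == "demeanq":
        return max(1, n_quarters_pre + 1)
    if method == "detrendq":
        return max(2, n_quarters_pre + 1)
    raise ValueError(f"unknown rolling method: {method!r}")

def units_failing(method, pre_obs_by_unit, quarters_by_unit=None):
    """Return the sorted ids of units with too few pre-treatment observations."""
    quarters_by_unit = quarters_by_unit or {}
    return sorted(
        unit for unit, n_pre in pre_obs_by_unit.items()
        if n_pre < required_pre_obs(method, quarters_by_unit.get(unit, 0))
    )

too_short = units_failing("detrend", {"u1": 3, "u2": 1})
```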

This exception is also raised in staggered adoption designs when exclude_pre_periods is specified and the remaining pre-treatment periods are insufficient for the chosen transformation method.

Variables:

cohort (int or None) – The treatment cohort identifier that triggered the error. Only set in staggered adoption mode.
available (int or None) – Number of pre-treatment periods remaining after exclusion. Only set when exclude_pre_periods is used.
required (int or None) – Minimum number of pre-treatment periods required by the transformation method (1 for demean, 2 for detrend, etc.).
excluded (int or None) – Number of periods excluded via exclude_pre_periods parameter.

See also

lwdid.transformations.apply_rolling_transform: Applies rolling transformations and enforces pre-period requirements.

Notes

When the no-anticipation assumption may be violated, excluding periods immediately before treatment from the transformation window can provide robustness. For cohort g with exclude_pre_periods=k, the pre-treatment window becomes {T_min, …, g-1-k} instead of {T_min, …, g-1}.
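
The effect of exclude_pre_periods on the window can be seen with a few lines of arithmetic. The helper below is hypothetical and assumes integer, consecutive period labels.

```python
def pre_treatment_window(t_min, cohort, exclude_pre_periods=0):
    """Periods used for the transformation: {t_min, ..., cohort - 1 - k}."""
    if exclude_pre_periods < 0:
        raise ValueError("exclude_pre_periods must be non-negative")
    # range() excludes its upper bound, so the last period is cohort - 1 - k.
    return list(range(t_min, cohort - exclude_pre_periods))

full = pre_treatment_window(2000, 2005)
trimmed = pre_treatment_window(2000, 2005, exclude_pre_periods=2)
```

With T_min = 2000 and g = 2005, excluding two periods leaves three pre-treatment periods, which still satisfies the documented minimum for detrend (2) but leaves less room for the quarterly methods.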

exception lwdid.exceptions.InsufficientQuarterDiversityError[source]

Bases: InsufficientDataError

Exception raised when quarterly data requirements are not met.

Raised for quarterly transformation methods (demeanq/detrendq) when quarter coverage is insufficient: post-treatment periods contain quarters that do not appear in the pre-treatment period for a given unit, preventing estimation of seasonal effects for those quarters.

See also

lwdid.validation.validate_quarter_coverage: Quarter coverage validation.
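
The coverage condition amounts to a set difference per unit. The sketch below is illustrative and does not reproduce validate_quarter_coverage: `uncovered_quarters` is a hypothetical helper that takes the quarter labels (1 to 4) of one unit's pre- and post-treatment observations.

```python
def uncovered_quarters(pre_quarters, post_quarters):
    """Quarters that appear after treatment but never before it."""
    return sorted(set(post_quarters) - set(pre_quarters))

# One unit observed in Q1-Q3 before treatment and in Q3, Q4, Q1 afterwards.
missing = uncovered_quarters(pre_quarters=[1, 2, 3, 1, 2],
                             post_quarters=[3, 4, 1])
```

A non-empty result means the seasonal effect for those quarters cannot be estimated from the unit's pre-treatment data.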

exception lwdid.exceptions.TimeDiscontinuityError[source]

Bases: LWDIDError

Exception raised when the time series is discontinuous or the post variable is non-monotone.

This exception is raised in two scenarios:

Time index discontinuity: The time index has gaps, meaning there are missing periods in the sequence. A continuous time index is required for valid transformation and estimation.
Post variable non-monotonicity: The post-treatment indicator is not monotone non-decreasing in time, suggesting that the policy was reversed or suspended. The method assumes absorbing treatment states.

Both scenarios violate the assumptions required for valid difference-in-differences estimation and will cause estimation to fail or produce incorrect results.
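
Both conditions are easy to test before calling the estimator. The helpers below are hypothetical sketches that assume integer period labels spaced one apart and a 0/1 post indicator that is constant within each period.

```python
def missing_periods(periods):
    """Return the periods absent between the first and last observed one."""
    present = set(periods)
    if not present:
        return []
    return [t for t in range(min(present), max(present) + 1)
            if t not in present]

def post_is_monotone(post_by_period):
    """True if the post indicator never switches back from 1 to 0."""
    ordered = [post for _, post in sorted(post_by_period.items())]
    return all(a <= b for a, b in zip(ordered, ordered[1:]))

gaps = missing_periods([2000, 2001, 2003, 2004])
reversed_policy = not post_is_monotone({2000: 0, 2001: 1, 2002: 0})
```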

See also

LWDIDError: Base exception class.

exception lwdid.exceptions.MissingRequiredColumnError[source]

Bases: LWDIDError

Exception raised when input DataFrame is missing required columns.

Required columns depend on the function call but typically include:

y: Outcome variable
d: Treatment indicator
ivar: Unit identifier
tvar: Time variable (single column or list of [year, quarter])
post: Post-treatment indicator
controls: Control variables (if specified)

See also

LWDIDError: Base exception class.

exception lwdid.exceptions.RandomizationError[source]

Bases: LWDIDError

Exception raised when randomization inference (RI) fails.

Common causes include invalid number of replications (rireps <= 0), missing required columns in input data, sample size too small for resampling (N < 3), invalid ri_method specification, or insufficient valid draws for reliable inference.

See also

LWDIDError: Base exception class.

exception lwdid.exceptions.VisualizationError[source]

Bases: LWDIDError

Exception raised for visualization-related errors.

Common causes include plot data missing required columns (such as transformed outcome, treatment indicator, or time index) or missing plotting backend (matplotlib not installed).

See also

LWDIDError: Base exception class.

exception lwdid.exceptions.InvalidStaggeredDataError[source]

Bases: LWDIDError

Exception raised when staggered data validation fails.

This exception is raised in the following scenarios:

Invalid gvar values: negative values in the gvar column, string values instead of numeric types, or values that cannot be interpreted as a treatment cohort or never-treated.
No valid cohorts: all units are never-treated (no treated cohorts to estimate effects for), or all gvar values are NaN/0/inf with no positive integers.
Inconsistent gvar within unit: the same unit has different gvar values across time periods (gvar should be time-invariant within unit).

Valid gvar values:

Positive integer: Treatment cohort (first treatment period)
0: Never treated
np.inf: Never treated
NaN/None: Never treated
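
The encoding rules above can be expressed as a small classifier. This is a sketch of the documented rules, not the package's validator: `classify_gvar` is a hypothetical name, and treating booleans and non-integer positive numbers as invalid is an assumption.

```python
import math
from numbers import Real

def classify_gvar(value):
    """Return 'never_treated', 'cohort', or 'invalid' for one gvar value."""
    if value is None:
        return "never_treated"
    # Strings and other non-numeric types are invalid. Excluding bool is an
    # assumption: True/False are unlikely to be intended as cohort labels.
    if isinstance(value, bool) or not isinstance(value, Real):
        return "invalid"
    if math.isnan(value) or value == 0 or value == math.inf:
        return "never_treated"
    # Assumption: positive values must be whole numbers to count as a cohort.
    if value > 0 and float(value).is_integer():
        return "cohort"
    return "invalid"

labels = [classify_gvar(v) for v in (2005, 0, float("inf"), None, -3, "2005")]
```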

See also

LWDIDError: Base exception class.
NoNeverTreatedError: Raised when never-treated units are required but absent.

exception lwdid.exceptions.NoNeverTreatedError[source]

Bases: InsufficientDataError

Exception raised when never-treated units are required but absent.

This exception is raised when:

aggregate='cohort' is specified but no never-treated units exist
aggregate='overall' is specified but no never-treated units exist

Never-treated units are required as the control group for cohort and overall effect aggregation because different cohorts use different pre-treatment periods for transformation, and only never-treated units can serve as a consistent reference across cohorts.

For (g,r)-specific effects, not-yet-treated units can serve as controls, so this exception is not raised when aggregate='none'.

See also

InsufficientDataError: Parent class for data insufficiency errors.
InvalidStaggeredDataError: Raised for other staggered data validation failures.

exception lwdid.exceptions.AggregationError[source]

Bases: LWDIDError

Base exception class for aggregation-related errors.

This is the parent class for all exceptions related to repeated cross-sectional data aggregation. Specific subclasses indicate the exact nature of the aggregation failure.

See also

InvalidAggregationError: Raised when aggregation constraints are violated.
InsufficientCellSizeError: Raised when all cells are below minimum size.

exception lwdid.exceptions.InvalidAggregationError[source]

Bases: AggregationError

Exception raised when aggregation constraints are violated.

This exception is raised in the following scenarios:

Treatment varies within cell: Treatment status is not constant within a (unit, period) cell, violating the assumption that treatment is assigned at the aggregation unit level.
gvar varies within unit: Treatment timing (gvar) is not constant within a unit across all periods, violating the time-invariance assumption.
Duplicate column names: Aggregation would result in duplicate column names in the output.
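
The first scenario can be diagnosed by grouping individual-level records by cell. The helper below is a hypothetical sketch that assumes the repeated cross-section is available as (unit, period, treatment) tuples.

```python
from collections import defaultdict

def cells_with_mixed_treatment(records):
    """Return the (unit, period) cells whose treatment status is not constant."""
    seen = defaultdict(set)
    for unit, period, treatment in records:
        seen[(unit, period)].add(treatment)
    return sorted(cell for cell, values in seen.items() if len(values) > 1)

records = [
    ("s1", 2000, 0), ("s1", 2000, 0),
    ("s2", 2000, 0), ("s2", 2000, 1),  # treatment varies within this cell
]
mixed = cells_with_mixed_treatment(records)
```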

See also

AggregationError: Parent class for aggregation errors.

exception lwdid.exceptions.InsufficientCellSizeError[source]

Bases: AggregationError

Exception raised when all cells are below minimum size threshold.

This exception is raised when the min_cell_size parameter is specified and all (unit, period) cells have fewer observations than the threshold, resulting in an empty output panel.
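
To see whether a given threshold would empty the panel, cell sizes can be counted directly. This is an illustrative sketch with a hypothetical helper, `cells_meeting_threshold`, and it assumes one (unit, period) pair per individual observation.

```python
from collections import Counter

def cells_meeting_threshold(records, min_cell_size):
    """Map each (unit, period) cell that is large enough to its size."""
    sizes = Counter(records)
    return {cell: n for cell, n in sizes.items() if n >= min_cell_size}

records = [("s1", 2000)] * 3 + [("s2", 2000)]
kept = cells_meeting_threshold(records, min_cell_size=2)
all_dropped = not cells_meeting_threshold(records, min_cell_size=5)
```

An empty result corresponds to the condition described above, where every cell falls below the threshold.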

See also

AggregationError: Parent class for aggregation errors.

Overview

The package uses custom exceptions to provide informative error messages when data or parameters do not meet requirements. All custom exceptions inherit from LWDIDError, making it easy to catch all package-specific errors.

Exception Hierarchy

LWDIDError (base class)
├── InvalidParameterError
│   ├── InvalidRollingMethodError
│   └── InvalidVCETypeError
├── InsufficientDataError
│   ├── NoTreatedUnitsError
│   ├── NoControlUnitsError
│   ├── InsufficientPrePeriodsError
│   ├── InsufficientQuarterDiversityError
│   └── NoNeverTreatedError
├── InvalidStaggeredDataError
├── TimeDiscontinuityError
├── MissingRequiredColumnError
├── RandomizationError
├── VisualizationError
├── UnbalancedPanelError
└── AggregationError
    ├── InvalidAggregationError
    └── InsufficientCellSizeError
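
One practical consequence of the tree is that handler order matters: Python uses the first matching except clause, so handlers for subclasses must come before handlers for their parents. The sketch below uses stand-in classes that mirror one branch of the tree so that it runs without the package installed; in real code they would be imported from lwdid.exceptions.

```python
# Stand-ins mirroring one branch of the hierarchy above.
class LWDIDError(Exception):
    pass

class InsufficientDataError(LWDIDError):
    pass

class NoNeverTreatedError(InsufficientDataError):
    pass

def describe(exc):
    """Raise `exc` and report which handler caught it."""
    try:
        raise exc
    except NoNeverTreatedError:      # most specific first
        return "no never-treated units"
    except InsufficientDataError:    # then its parent
        return "insufficient data"
    except LWDIDError:               # base class last
        return "other lwdid error"

caught = [describe(NoNeverTreatedError("x")),
          describe(InsufficientDataError("x")),
          describe(LWDIDError("x"))]
```

If the InsufficientDataError handler were listed first, it would also catch NoNeverTreatedError and the more specific handler would never run.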

Exception Classes

LWDIDError

Base class for all lwdid exceptions.

Usage:

from lwdid import lwdid
from lwdid.exceptions import LWDIDError

try:
    results = lwdid(data, 'y', 'd', 'unit', 'year', 'post', 'demean')
except LWDIDError as e:
    print(f"LWDID error: {e}")

When raised: Never raised directly; use specific subclasses.

InsufficientDataError

Raised when: Data does not meet minimum sample size or period requirements.

Common causes:

Too few units (N < 3)
Insufficient pre-treatment periods for chosen transformation
No post-treatment periods
Empty dataset after filtering

Example:

from lwdid import lwdid
from lwdid.exceptions import InsufficientDataError

try:
    results = lwdid(data, 'y', 'd', 'unit', 'year', 'post', 'detrend')
except InsufficientDataError as e:
    print(f"Insufficient data: {e}")
    # Use different transformation or collect more data

Typical error messages:

InsufficientPrePeriodsError: Insufficient pre-treatment periods for 'detrend'.
All units must have at least 2 pre-treatment periods (T0 >= 2).
Found units with fewer periods: ['unit_3', 'unit_7']

InsufficientDataError: Sample size too small.
Need at least 3 units for estimation, found 2.

InsufficientDataError: No post-treatment periods found.
The 'post' variable is 0 for all observations.

InvalidParameterError

Raised when: Input parameter validation fails.

Common causes:

Invalid rolling method name (see InvalidRollingMethodError)
Invalid vce option (see InvalidVCETypeError)
cluster_var missing or incompatible when vce='cluster'
Treatment indicator or controls not time-invariant
Non-numeric outcome or control variables
Time variables not convertible to valid numeric year/quarter values

Typical error messages (illustrative):

InvalidParameterError: rolling() must be one of: demean, detrend, demeanq, detrendq. Got: 'invalid_method'

InvalidParameterError: vce='cluster' requires cluster_var parameter to be specified.

InvalidParameterError: Treatment indicator 'd' must be time-invariant (constant within each unit).

InvalidRollingMethodError

Specialized subclass of InvalidParameterError raised when the rolling argument does not match one of the supported transformation methods ('demean', 'detrend', 'demeanq', 'detrendq').

InvalidVCETypeError

Specialized subclass of InvalidParameterError raised when the vce argument is not one of None, 'robust', 'hc0', 'hc1', 'hc2', 'hc3', 'hc4', or 'cluster'.

InvalidStaggeredDataError

Raised when: Staggered adoption data validation fails.

Common causes:

gvar column contains invalid values (negative numbers or non-numeric types)
No valid treatment cohorts (all units are never-treated)
gvar is not time-invariant within units (same unit has different gvar values across time periods)

Valid gvar values:

Positive integer: Treatment cohort (first treatment period)
0: Never treated
np.inf: Never treated
NaN/None: Never treated

Example:

from lwdid import lwdid
from lwdid.exceptions import InvalidStaggeredDataError

try:
    results = lwdid(
        data, y='outcome', ivar='unit', tvar='year',
        gvar='first_treat', rolling='demean'
    )
except InvalidStaggeredDataError as e:
    print(f"Staggered data error: {e}")
    # Check gvar column for invalid values

Typical error messages:

InvalidStaggeredDataError: gvar column contains negative values.

InvalidStaggeredDataError: No valid treatment cohorts found.
All units are never-treated.

InvalidStaggeredDataError: gvar is not time-invariant within unit 'unit_5'.

NoNeverTreatedError

Raised when: Never-treated units are required but absent.

This is a subclass of InsufficientDataError raised in staggered adoption settings when cohort-level or overall aggregation is requested but no never-treated units exist in the data.

Common causes:

aggregate='cohort' specified but no never-treated units
aggregate='overall' specified but no never-treated units

Why never-treated units are required:

For cohort and overall effect aggregation, different cohorts use different pre-treatment periods for transformation. Only never-treated units can serve as a consistent reference across cohorts.

For (g,r)-specific effects with aggregate='none', not-yet-treated units can serve as controls, so this exception is not raised.

Example:

from lwdid import lwdid
from lwdid.exceptions import NoNeverTreatedError

try:
    results = lwdid(
        data, y='outcome', ivar='unit', tvar='year',
        gvar='first_treat', rolling='demean',
        aggregate='overall'  # Requires never-treated units
    )
except NoNeverTreatedError as e:
    print(f"No never-treated units: {e}")
    # Use aggregate='none' or add never-treated units to data

Typical error message:

NoNeverTreatedError: aggregate='overall' requires never-treated units,
but none were found in the data.

Data Validation Errors

Raised as: InvalidParameterError, InsufficientDataError, TimeDiscontinuityError, or MissingRequiredColumnError when input data fail validation checks.

Common causes:

Invalid rolling method name (see InvalidRollingMethodError)
Invalid vce option (see InvalidVCETypeError)
cluster_var missing or incompatible when vce='cluster'
Treatment indicator or controls not time-invariant
Non-numeric outcome or control variables
Time variables not convertible to valid numeric year/quarter values

Typical issues (conceptual):

Singular matrix or near-singular design matrix (perfect or near-perfect collinearity among regressors)
Insufficient variation in key variables (for example, all units have the same treatment status in the regression sample)

Estimation Errors

Raised as: subclasses of LWDIDError (for example InsufficientDataError or InvalidParameterError) when estimation fails due to data or parameter issues. Low-level numerical failures from underlying libraries (for example, singular matrix errors in statsmodels) may instead surface as their native exceptions.

Common causes:

Singular matrix or near-singular design matrix (perfect or near-perfect collinearity among regressors)
Insufficient variation in key variables (for example, all units have the same treatment status in the regression sample)

Example:

from lwdid import lwdid
from lwdid.exceptions import LWDIDError

try:
    results = lwdid(data, 'y', 'd', 'unit', 'year', 'post', 'demean')
except LWDIDError as e:
    print(f"Estimation failed: {e}")
    # Check for perfect collinearity, insufficient variation, or other issues
except Exception as e:
    print(f"Unexpected error: {e}")
    # Handle other errors

Error Handling Best Practices

Catch Specific Exceptions

Catch specific exceptions for targeted error handling:

from lwdid import lwdid
from lwdid.exceptions import (
    InvalidParameterError,
    InsufficientDataError,
    InvalidStaggeredDataError,
    NoNeverTreatedError,
    TimeDiscontinuityError,
    MissingRequiredColumnError,
    RandomizationError,
    VisualizationError,
    UnbalancedPanelError,
    AggregationError,
)

try:
    results = lwdid(data, 'y', 'd', 'unit', 'year', 'post', 'demean')

except MissingRequiredColumnError as e:
    print(f"Missing columns: {e}")
    # Fix data and retry

except TimeDiscontinuityError as e:
    print(f"Time structure issue: {e}")
    # Fix time index or post indicator

except InvalidStaggeredDataError as e:
    print(f"Staggered data error: {e}")
    # Check gvar column for valid values

except NoNeverTreatedError as e:
    print(f"No never-treated units: {e}")
    # Use aggregate='none' or add never-treated units

except InsufficientDataError as e:
    print(f"Not enough data: {e}")
    # Use different method or collect more data

except InvalidParameterError as e:
    print(f"Parameter error: {e}")
    # Fix parameters and retry

except RandomizationError as e:
    print(f"Randomization inference failed: {e}")

except VisualizationError as e:
    print(f"Plotting failed: {e}")

Catch All Package Errors

Use LWDIDError to catch all package-specific errors:

from lwdid import lwdid
from lwdid.exceptions import LWDIDError

try:
    results = lwdid(data, 'y', 'd', 'unit', 'year', 'post', 'demean')
except LWDIDError as e:
    print(f"LWDID error: {e}")
    # Handle any package error
except Exception as e:
    print(f"Unexpected error: {e}")
    # Handle other errors

Logging Errors

Log errors for debugging:

import logging
from lwdid import lwdid
from lwdid.exceptions import LWDIDError

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

try:
    results = lwdid(data, 'y', 'd', 'unit', 'year', 'post', 'demean')
except LWDIDError:
    # logger.exception logs at ERROR level and appends the traceback
    logger.exception("LWDID estimation failed")
    raise

Graceful Degradation

Try alternative specifications when estimation fails:

from lwdid import lwdid
from lwdid.exceptions import InsufficientDataError

# Try detrend first
try:
    results = lwdid(data, 'y', 'd', 'unit', 'year', 'post', 'detrend')
    print("Using detrend transformation")

except InsufficientDataError:
    # Fall back to demean if insufficient pre-treatment periods
    results = lwdid(data, 'y', 'd', 'unit', 'year', 'post', 'demean')
    print("Insufficient data for detrend, using demean instead")
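
The same pattern generalizes to an ordered list of transformations. The sketch below is an assumption-laden illustration, not part of the package: `estimate_with_fallback` is a hypothetical helper, `fake_estimate` stands in for a call such as lwdid(data, ..., rolling=method), and the exception class is redefined locally so the example runs on its own.

```python
class InsufficientDataError(Exception):
    """Stand-in for lwdid.exceptions.InsufficientDataError."""

def estimate_with_fallback(estimate, methods=("detrend", "demean")):
    """Try each method in order; return (method, result) for the first success."""
    if not methods:
        raise ValueError("at least one method is required")
    last_error = None
    for method in methods:
        try:
            return method, estimate(method)
        except InsufficientDataError as exc:
            last_error = exc  # remember the failure and try the next method
    raise last_error

def fake_estimate(method):
    # Placeholder estimator: pretends detrend lacks pre-treatment periods.
    if method == "detrend":
        raise InsufficientDataError("need at least 2 pre-treatment periods")
    return f"fitted with {method}"

used, result = estimate_with_fallback(fake_estimate)
```

Returning the method alongside the result keeps the fallback visible to the caller, which matters because demean and detrend rest on different identifying assumptions.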

Common Error Scenarios

Scenario 1: Insufficient Pre-Treatment Periods

Error:

InsufficientPrePeriodsError: Insufficient pre-treatment periods for 'detrend'.

Diagnosis:

# Check pre-treatment periods by unit
pre_periods = data[data['post'] == 0].groupby('unit').size()
print(pre_periods[pre_periods < 2])

Solution:

# Option 1: Use demean instead
results = lwdid(data, 'y', 'd', 'unit', 'year', 'post', 'demean')

# Option 2: Drop units with insufficient periods
units_ok = pre_periods[pre_periods >= 2].index
data = data[data['unit'].isin(units_ok)]

Scenario 2: Time-Varying Controls

Error:

InvalidParameterError: Control variable 'income' must be time-invariant.

Diagnosis:

# Check which controls vary
for control in ['income', 'population']:
    varying = data.groupby('unit')[control].nunique()
    print(f"{control}: {(varying > 1).sum()} units vary")

Solution:

# Use baseline (first period) value
baseline = data.groupby('unit')['income'].first().reset_index()
baseline.columns = ['unit', 'income_baseline']
data = data.drop('income', axis=1).merge(baseline, on='unit')
