Inference Module (inference)
=============================

Advanced inference methods for difference-in-differences estimation.

This module provides inference methods beyond standard asymptotic approaches,
including wild cluster bootstrap for reliable inference with few clusters.

Overview
--------

Standard cluster-robust standard errors require a large number of clusters
for valid asymptotic inference. When the number of clusters is small (typically
fewer than 20-30), wild cluster bootstrap methods provide more accurate
inference by resampling cluster-level weights rather than relying on asymptotic
approximations.

This module implements the wild cluster bootstrap procedure with extensions for
difference-in-differences settings.

Wild Cluster Bootstrap
----------------------

.. autoclass:: lwdid.inference.WildClusterBootstrapResult
   :members:
   :no-index:

.. autofunction:: lwdid.inference.wild_cluster_bootstrap

.. autofunction:: lwdid.inference.wild_cluster_bootstrap_test_inversion

Example Usage
-------------

Basic Wild Cluster Bootstrap
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from lwdid.inference import wild_cluster_bootstrap
   
   # Run wild cluster bootstrap on transformed data
   result = wild_cluster_bootstrap(
       data=transformed_data,
       y_transformed='y_dot',
       d='treated',
       cluster_var='state',
       n_bootstrap=999,
       weight_type='rademacher'
   )
   
   print(f"Original ATT: {result.att:.4f}")
   print(f"Bootstrap SE: {result.se_bootstrap:.4f}")
   print(f"Bootstrap p-value: {result.pvalue:.4f}")
   print(f"95% CI: [{result.ci_lower:.4f}, {result.ci_upper:.4f}]")

Weight Types
~~~~~~~~~~~~

Three bootstrap weight distributions are available:

.. code-block:: python

   # Rademacher weights (+1 or -1 with equal probability)
   result_rad = wild_cluster_bootstrap(
       data, y_transformed='y_dot', d='treated',
       cluster_var='state', weight_type='rademacher'
   )
   
   # Mammen weights (two-point distribution)
   result_mam = wild_cluster_bootstrap(
       data, y_transformed='y_dot', d='treated',
       cluster_var='state', weight_type='mammen'
   )
   
   # Webb weights (six-point distribution)
   result_web = wild_cluster_bootstrap(
       data, y_transformed='y_dot', d='treated',
       cluster_var='state', weight_type='webb'
   )

Test Inversion for Confidence Intervals
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For more accurate confidence intervals with few clusters:

.. code-block:: python

   from lwdid.inference import wild_cluster_bootstrap_test_inversion
   
   # Construct CI via test inversion
   result = wild_cluster_bootstrap_test_inversion(
       data=transformed_data,
       y_transformed='y_dot',
       d='treated',
       cluster_var='state',
       alpha=0.05,
       n_bootstrap=999
   )
   
   print(f"CI via test inversion: [{result.ci_lower:.4f}, {result.ci_upper:.4f}]")

Integration with lwdid
~~~~~~~~~~~~~~~~~~~~~~

Wild cluster bootstrap is used as a standalone function after running the main
estimation. First estimate the model using ``lwdid()``, then apply wild cluster
bootstrap to the transformed data:

.. code-block:: python

   from lwdid import lwdid
   from lwdid.inference import wild_cluster_bootstrap
   
   # Step 1: Run standard estimation with cluster-robust SE
   results = lwdid(
       data, y='outcome', d='treated', ivar='unit', tvar='year',
       post='post', rolling='demean',
       vce='cluster', cluster_var='state'
   )
   
   # Step 2: Apply wild cluster bootstrap to the transformed data
   boot_result = wild_cluster_bootstrap(
       data=results.data,
       y_transformed='ydot_postavg',
       d='d_',
       cluster_var='state',
       n_bootstrap=999
   )
   
   print(f"Bootstrap p-value: {boot_result.pvalue:.4f}")

Complete Enumeration
~~~~~~~~~~~~~~~~~~~~

With small numbers of clusters (G <= 12), exact enumeration of all possible
weight combinations provides exact p-values:

.. code-block:: python

   # Complete enumeration with few clusters
   result = wild_cluster_bootstrap(
       data=transformed_data,
       y_transformed='y_dot',
       d='treated',
       cluster_var='state',  # If G <= 12, complete enumeration is used
       weight_type='rademacher'
   )
   
   # Check if complete enumeration was used
   if result.n_bootstrap == 2**result.n_clusters:
       print("Complete enumeration used - exact p-value")

Methodological Notes
--------------------

**Algorithm:**

1. Estimate the original model and obtain residuals :math:`\hat{u}_{ic}`
2. Generate cluster-level weights :math:`w_c` from chosen distribution
3. Construct bootstrap residuals: :math:`u^*_{ic} = w_c \times \hat{u}_{ic}`
4. Form bootstrap outcomes: :math:`Y^*_{ic} = \hat{Y}_{ic} + u^*_{ic}`
5. Re-estimate the model and compute t-statistic
6. Repeat B times to obtain bootstrap distribution
7. Compute p-value as proportion of bootstrap t-statistics exceeding observed

**Weight Distributions:**

- **Rademacher**: :math:`P(w = 1) = P(w = -1) = 0.5`
- **Mammen**: Two-point distribution with :math:`E[w] = 0`, :math:`E[w^2] = 1`, :math:`E[w^3] = 1`
- **Webb**: Six-point distribution for improved performance with few clusters

**Null Imposition:**

The ``impose_null`` parameter determines whether bootstrap samples are
constructed under the null hypothesis (H0: treatment effect = 0). Imposing
the null generally improves power but may be conservative.

Guidelines
----------

**When to Use Wild Cluster Bootstrap:**

- Number of clusters < 30
- Treatment varies at cluster level
- Standard cluster-robust SEs may be unreliable

**Recommended Settings:**

- ``n_bootstrap=999`` or ``n_bootstrap=9999`` for publication
- ``weight_type='rademacher'`` is standard choice
- ``impose_null=True`` for testing H0: effect = 0

See Also
--------

- :doc:`clustering_diagnostics` - Clustering diagnostics and recommendations
- :doc:`../methodological_notes` - Theoretical foundations