Merged
34 commits
04bd263
Survey Phase 7: CS IPW/DR covariates, repeated cross-sections, Honest…
igerber Mar 28, 2026
4bf566d
Address CI review: RCS IF corrections, aggregation weights, replicate…
igerber Mar 29, 2026
6080f92
Fix DR RC normalizer mismatch, holistic RCS cohort-mass weighting, un…
igerber Mar 29, 2026
b623dee
Rewrite RC reg/DR to match DRDID::reg_did_rc and DRDID::drdid_rc form…
igerber Mar 29, 2026
3b405b7
Fix bootstrap RCS cohort-mass weighting, reset stale event-study VCV
igerber Mar 29, 2026
53cfd5d
Clear analytical event_study_vcov when bootstrap overwrites event-stu…
igerber Mar 29, 2026
9ff21a2
Fix RC IF normalization scaling: M1 uses n_all denominator, PS M2 use…
igerber Mar 29, 2026
c2f8fdc
Document RCS IF phi=psi/n convention, add analytical-vs-bootstrap SE …
igerber Mar 29, 2026
cb3f815
Refactor RC IFs to R's psi convention, fix HonestDiD VCV subsetting
igerber Mar 29, 2026
7e127fb
Resolve merge conflict, match R colMeans convention in panel IPW/DR M…
igerber Mar 29, 2026
eac680e
Match R's H/n, asy_rep/n, colMeans convention for panel PS correction…
igerber Mar 29, 2026
9893454
Fix VCV index alignment, add stationarity warning for panel=False
igerber Mar 29, 2026
4415034
Document panel DR control-augmentation normalization deviation from D…
igerber Mar 29, 2026
1c35440
Warn on non-universal base period in HonestDiD CS path, update tests
igerber Mar 29, 2026
867cd51
Fix panel M2 full-sample colMeans, add HonestDiD consecutive event-ti…
igerber Mar 29, 2026
9f3cab4
Fix HonestDiD grid validator for reference-period gap, defensive boot…
igerber Mar 29, 2026
c529053
HonestDiD: raise ValueError on non-consecutive event-time grid (was w…
igerber Mar 29, 2026
e9995ef
HonestDiD: validate full grid around reference period, not just withi…
igerber Mar 29, 2026
1f8a537
Fix HonestDiD: reference-aware pre/post split, replicate df=0 sentinel
igerber Mar 29, 2026
c5015c7
Fix _estimate_max_pre_violation to use reference-aware pre_periods
igerber Mar 29, 2026
7dc09e8
Document panel IPW/DR PS correction scaling with bootstrap convergenc…
igerber Mar 30, 2026
4073fa8
HonestDiD: raise on no pre-period coefficients in CS path, remove ove…
igerber Mar 30, 2026
4d65f75
Fix HonestDiD varying-base split: use t<0/t>=0 when no reference marker
igerber Mar 30, 2026
aaa2b20
Fix reference-period detection (effect=0.0+NaN SE), warn on bootstrap…
igerber Mar 30, 2026
9c8ebb4
Relabel IF scaling as implementation choice (not deviation), fix cont…
igerber Mar 30, 2026
df21c7e
Fix panel M2 scaling: revert to np.mean over control rows (n_c denomi…
igerber Mar 30, 2026
6789e37
Match R exactly: H = X'WX (no /n), asy_rep = score @ inv(H) (no /n)
igerber Mar 30, 2026
b889a14
Restructure all PS corrections to R's psi convention with explicit ph…
igerber Mar 30, 2026
81724ca
Reframe REGISTRY IF scaling note as implementation choice with bootst…
igerber Mar 30, 2026
43c5547
Fix PS nuisance IF correction scaling: remove extra /n division
igerber Mar 30, 2026
c0c7e5a
Fix M2 gradient scaling: use np.sum instead of np.mean over control s…
igerber Mar 30, 2026
527618f
Fix IPW PS correction sign and merge main
igerber Mar 30, 2026
9086114
Fix replicate-weight df propagation: return per-statistic df instead …
igerber Mar 30, 2026
08822e1
Fix StaggeredTripleDifference unpack for 3-tuple _aggregate_simple re…
igerber Mar 31, 2026
28 changes: 20 additions & 8 deletions ROADMAP.md
@@ -8,19 +8,37 @@ For past changes and release history, see [CHANGELOG.md](CHANGELOG.md).

## Current Status

diff-diff v2.6.0 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis:
diff-diff v2.7.5 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis, plus **unique survey support** — design-based variance estimation (Taylor linearization, replicate weights) integrated across all estimators. No R or Python package offers this combination:

- **Core estimators**: Basic DiD, TWFE, MultiPeriod, Callaway-Sant'Anna, Sun-Abraham, Borusyak-Jaravel-Spiess Imputation, Synthetic DiD, Triple Difference (DDD), TROP, Two-Stage DiD (Gardner 2022), Stacked DiD (Wing et al. 2024), Continuous DiD (Callaway, Goodman-Bacon & Sant'Anna 2024)
- **Valid inference**: Robust SEs, cluster SEs, wild bootstrap, multiplier bootstrap, placebo-based variance
- **Assumption diagnostics**: Parallel trends tests, placebo tests, Goodman-Bacon decomposition
- **Sensitivity analysis**: Honest DiD (Rambachan-Roth), Pre-trends power analysis (Roth 2022)
- **Study design**: Power analysis tools
- **Data utilities**: Real-world datasets (Card-Krueger, Castle Doctrine, Divorce Laws, MPDTA), DGP functions for all supported designs
- **Survey support**: Full `SurveyDesign` with strata, PSU, FPC, weight types, replicate weights (BRR/Fay/JK1/JKn), Taylor linearization, DEFF diagnostics, subpopulation analysis — integrated across all estimators (see [survey-roadmap.md](docs/survey-roadmap.md))
- **Performance**: Optional Rust backend for accelerated computation; faster than R at scale (see [CHANGELOG.md](CHANGELOG.md) for benchmarks)

---

## Near-Term Enhancements (v2.7)
## Near-Term Enhancements (v2.8)

### Survey Phase 7: Completing the Survey Story

Close the remaining gaps for practitioners using major population surveys
(ACS, CPS, BRFSS, MEPS). See [survey-roadmap.md](docs/survey-roadmap.md) for
full details.

- **CS Covariates + IPW/DR + Survey** *(Implemented)*: DRDID nuisance IF
corrections (PS + OR) under survey weights for all estimation methods.
- **Repeated Cross-Sections** *(Implemented)*: `panel=False` support for
CallawaySantAnna using cross-sectional DRDID (Sant'Anna & Zhao 2020,
Section 4). Supports BRFSS, ACS annual, CPS monthly.
- **Survey-Aware DiD Tutorial** *(Open)*: Jupyter notebook demonstrating
the full workflow with realistic survey data.
- **HonestDiD + Survey Variance** *(Implemented)*: Survey df and full
event-study VCV propagated to sensitivity analysis, with bootstrap/replicate
diagonal fallback.
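
The "diagonal fallback" in the HonestDiD item above can be sketched roughly as follows. This is an illustration only: `event_study_vcov_or_diagonal` is a hypothetical helper name, not a function from diff-diff, and it assumes the fallback simply places squared standard errors on the diagonal when no full event-study VCV is available (for example after bootstrap or replicate-weight inference).

```python
import numpy as np


def event_study_vcov_or_diagonal(se, vcov=None):
    """Return the full event-study VCV if given, else diag(se**2).

    Hypothetical helper for illustration; the name and signature are
    assumptions, not the library's API.
    """
    se = np.asarray(se, dtype=float)
    if vcov is not None:
        vcov = np.asarray(vcov, dtype=float)
        # The full VCV must line up with the event-study coefficients.
        if vcov.shape != (se.size, se.size):
            raise ValueError("vcov shape does not match number of SEs")
        return vcov
    # Fallback: treat event-study estimates as uncorrelated.
    return np.diag(se ** 2)


se = np.array([0.5, 1.0, 2.0])
fallback = event_study_vcov_or_diagonal(se)
```

A diagonal matrix discards cross-period covariances, so sensitivity bounds computed from it may differ from those based on a full VCV.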

### Staggered Triple Difference (DDD)

@@ -32,12 +50,6 @@ Extend the existing `TripleDifference` estimator to handle staggered adoption se

**Reference**: [Ortiz-Villavicencio & Sant'Anna (2025)](https://arxiv.org/abs/2505.09942). "Better Understanding Triple Differences Estimators." *Working Paper*. R package: `triplediff`.

### Enhanced Visualization

- Synthetic control weight visualization (bar chart of unit weights)
- Treatment adoption "staircase" plot for staggered designs
- Interactive plots with plotly backend option

---

## Medium-Term Enhancements
20 changes: 13 additions & 7 deletions TODO.md
@@ -54,11 +54,17 @@ Deferred items from PR reviews that were not addressed before merge.
|-------|----------|----|----------|
| ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails) |
| Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium |
| CallawaySantAnna survey + covariates + IPW/DR: DRDID panel nuisance-estimation IF corrections not implemented. Currently gated with NotImplementedError. Regression method with covariates works. | `staggered.py` | #233 | Medium — tracked in Survey Phase 7a |
| EfficientDiD `control_group="last_cohort"` trims at `last_g - anticipation` but REGISTRY says `t >= last_g`. With `anticipation=0` (default) these are identical. Needs design decision for `anticipation>0`. | `efficient_did.py` | #230 | Low |
| TripleDifference power: `generate_ddd_data` is a fixed 2×2×2 cross-sectional DGP — no multi-period or unbalanced-group support. | `prep_dgp.py`, `power.py` | #208 | Low |
| Survey design resolution/collapse patterns inconsistent across panel estimators — extract shared helpers for panel-to-unit collapse, post-filter re-resolution, metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low |
| TROP: `fit()` and `_fit_global()` share ~150 lines of near-identical data setup. Extract shared helpers to eliminate cross-file sync risk. | `trop.py`, `trop_global.py`, `trop_local.py` | — | Low |
| Replicate-weight survey df — **Resolved**. `df_survey = rank(replicate_weights) - 1` matching R's `survey::degf()`. For IF paths, `n_valid - 1` when dropped replicates reduce effective count. | `survey.py` | #238 | Resolved |
| CallawaySantAnna survey: strata/PSU/FPC — **Resolved**. Aggregated SEs (overall, event study, group) use `compute_survey_if_variance()`. Bootstrap uses PSU-level multiplier weights. | `staggered.py` | #237 | Resolved |
| CallawaySantAnna survey + covariates + IPW/DR — **Resolved**. DRDID panel nuisance IF corrections (PS + OR) implemented for both survey and non-survey DR paths (Phase 7a). IPW path unblocked. | `staggered.py` | #233 | Resolved |
| SyntheticDiD/TROP survey: strata/PSU/FPC — **Resolved**. Rao-Wu rescaled bootstrap implemented for both. TROP uses cross-classified pseudo-strata. Rust TROP remains pweight-only (Python fallback for full design). | `synthetic_did.py`, `trop.py` | — | Resolved |
| EfficientDiD hausman_pretest() clustered covariance stale `n_cl` — **Resolved**. Recompute `n_cl` and remap indices after `row_finite` filtering via `np.unique(return_inverse=True)`. | `efficient_did.py` | #230 | Resolved |
| EfficientDiD `control_group="last_cohort"` trims at `last_g - anticipation` but REGISTRY says `t >= last_g`. With `anticipation=0` (default) these are identical. With `anticipation>0`, code is arguably more conservative (excludes anticipation-contaminated periods). Either align REGISTRY with code or change code to `t < last_g` — needs design decision. | `efficient_did.py` | #230 | Low |
| TripleDifference power: `generate_ddd_data` is a fixed 2×2×2 cross-sectional DGP — no multi-period or unbalanced-group support. Add a `generate_ddd_panel_data` for panel DDD power analysis. | `prep_dgp.py`, `power.py` | #208 | Low |
| ContinuousDiD event-study aggregation anticipation filter — **Resolved**. `_aggregate_event_study()` now filters `e < -anticipation` when `anticipation > 0`, matching CallawaySantAnna behavior. Bootstrap paths also filtered. | `continuous_did.py` | #226 | Resolved |
| Survey design resolution/collapse patterns are inconsistent across panel estimators — ContinuousDiD rebuilds unit-level design in SE code, EfficientDiD builds once in fit(), StackedDiD re-resolves on stacked data; extract shared helpers for panel-to-unit collapse, post-filter re-resolution, and metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low |
| Survey metadata formatting dedup — **Resolved**. Extracted `_format_survey_block()` helper in `results.py`, replaced 13 occurrences across 11 files. | `results.py` + 10 results files | — | Resolved |
| TROP: `fit()` and `_fit_global()` share ~150 lines of near-identical data setup (panel pivoting, absorbing-state validation, first-treatment detection, effective rank, NaN warnings). Both bootstrap methods also duplicate the stratified resampling loop. Extract shared helpers to eliminate cross-file sync risk. | `trop.py`, `trop_global.py`, `trop_local.py` | — | Low |
| StaggeredTripleDifference R cross-validation: CSV fixtures not committed (gitignored); tests skip without local R + triplediff. Commit fixtures or generate deterministically. | `tests/test_methodology_staggered_triple_diff.py` | #245 | Medium |
| StaggeredTripleDifference R parity: benchmark only tests no-covariate path (xformla=~1). Add covariate-adjusted scenarios and aggregation SE parity assertions. | `benchmarks/R/benchmark_staggered_triplediff.R` | #245 | Medium |
| StaggeredTripleDifference: per-cohort group-effect SEs include WIF (conservative vs R's wif=NULL). Documented in REGISTRY. Could override mixin for exact R match. | `staggered_triple_diff.py` | #245 | Low |
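
Two of the resolved rows above describe small numerical conventions that are easy to sketch. The snippet below is a minimal illustration under stated assumptions, not the code in `survey.py` or `efficient_did.py`: `replicate_df` and `remap_clusters` are hypothetical names, the replicate weights are assumed to be an observations-by-replicates matrix, and `row_finite` is assumed to be a boolean mask over rows.

```python
import numpy as np


def replicate_df(replicate_weights):
    """Design df as rank(replicate_weights) - 1 (hypothetical helper)."""
    w = np.asarray(replicate_weights, dtype=float)
    return int(np.linalg.matrix_rank(w)) - 1


def remap_clusters(cluster_ids, row_finite):
    """Re-index clusters to 0..n_cl-1 after dropping non-finite rows.

    Hypothetical helper: recomputes the cluster count from the rows
    that survive filtering instead of reusing a stale count.
    """
    mask = np.asarray(row_finite, dtype=bool)
    kept = np.asarray(cluster_ids)[mask]
    uniques, codes = np.unique(kept, return_inverse=True)
    return codes, len(uniques)


# Four observations, three linearly independent replicate columns.
weights = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]])
df = replicate_df(weights)

# Cluster 20 appears twice, but one of its rows is filtered out.
codes, n_cl = remap_clusters([10, 20, 30, 20], [True, False, True, True])
```

Recomputing `n_cl` from the filtered rows matters when filtering removes every row of some cluster; in that case the original cluster count would overstate the number of clusters entering the covariance.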
@@ -163,8 +169,8 @@ Features in R's `did` package that block porting additional tests:

| Feature | R tests blocked | Priority | Status |
|---------|----------------|----------|--------|
| Repeated cross-sections (`panel=FALSE`) | ~7 tests in test-att_gt.R + test-user_bug_fixes.R | High | Planned (Survey Phase 7b) |
| Sampling/population weights | 7 tests incl. all JEL replication | Medium | Mostly resolved (Phases 1-6); CS IPW/DR + covariates + survey in Phase 7a |
| Repeated cross-sections (`panel=FALSE`) | ~7 tests in test-att_gt.R + test-user_bug_fixes.R | High | **Resolved** — Phase 7b: `panel=False` on CallawaySantAnna |
| Sampling/population weights | 7 tests incl. all JEL replication | Medium | **Resolved** (Phases 1-6 + 7a: CS IPW/DR + covariates + survey) |
| Calendar time aggregation | 1 test in test-att_gt.R | Low | |

---