Skip to content

Commit eacd38c

Browse files
authored
Merge pull request #243 from igerber/docs/todo-cleanup
Clean up TODO.md: remove resolved items, update line counts
2 parents a0d3fe2 + 64006c8 commit eacd38c

1 file changed

Lines changed: 32 additions & 37 deletions

File tree

TODO.md

Lines changed: 32 additions & 37 deletions
Original file line numberDiff line numberDiff line change
@@ -19,25 +19,28 @@ Current limitations that may affect users:
1919

2020
### Large Module Files
2121

22-
Target: < 1000 lines per module for maintainability.
22+
Target: < 1000 lines per module for maintainability. Updated 2026-03-29.
2323

2424
| File | Lines | Action |
2525
|------|-------|--------|
26-
| `trop.py` | 862 | Split done — trop_global.py (990), trop_local.py (1000) |
27-
| `utils.py` | 1838 | Monitor |
28-
| `staggered.py` | 1785 | Monitor |
29-
| `imputation.py` | 1756 | Monitor |
30-
| `visualization.py` | 1727 | Monitor — growing but cohesive |
31-
| `linalg.py` | 1727 | Monitor — unified backend, splitting would hurt cohesion |
32-
| `triple_diff.py` | 1581 | Monitor |
26+
| `power.py` | 2588 | Consider splitting (power analysis + MDE + sample size) |
27+
| `linalg.py` | 2289 | Monitor — unified backend, splitting would hurt cohesion |
28+
| `staggered.py` | 2275 | Monitor — grew with survey support |
29+
| `imputation.py` | 2009 | Monitor |
30+
| `triple_diff.py` | 1921 | Monitor |
31+
| `utils.py` | 1902 | Monitor |
32+
| `two_stage.py` | 1708 | Monitor |
33+
| `survey.py` | 1646 | Monitor — grew with Phase 6 features |
34+
| `continuous_did.py` | 1626 | Monitor |
3335
| `honest_did.py` | 1511 | Acceptable |
34-
| `two_stage.py` | 1451 | Acceptable |
35-
| `power.py` | 1350 | Acceptable |
36-
| `prep.py` | 1242 | Acceptable |
37-
| `sun_abraham.py` | 1162 | Acceptable |
38-
| `continuous_did.py` | 1155 | Acceptable |
39-
| `estimators.py` | 1147 | Acceptable |
40-
| `pretrends.py` | 1104 | Acceptable |
36+
| `sun_abraham.py` | 1540 | Acceptable |
37+
| `estimators.py` | 1357 | Acceptable |
38+
| `trop_local.py` | 1261 | Acceptable |
39+
| `trop_global.py` | 1251 | Acceptable |
40+
| `prep.py` | 1225 | Acceptable |
41+
| `pretrends.py` | 1105 | Acceptable |
42+
| `trop.py` | 981 | Split done — trop_global.py + trop_local.py |
43+
| `visualization/` | 4172 | Subpackage (split across 7 files) — OK |
4144

4245
---
4346

@@ -49,20 +52,13 @@ Deferred items from PR reviews that were not addressed before merge.
4952

5053
| Issue | Location | PR | Priority |
5154
|-------|----------|----|----------|
52-
| ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails; fixing requires sparse least-squares alternatives) |
53-
| EfficientDiD: API docs / tutorial page for new public estimator | `docs/` | #192 | Medium |
55+
| ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails) |
5456
| Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium |
55-
| Replicate-weight survey df — **Resolved**. `df_survey = rank(replicate_weights) - 1` matching R's `survey::degf()`. For IF paths, `n_valid - 1` when dropped replicates reduce effective count. | `survey.py` | #238 | Resolved |
56-
| CallawaySantAnna survey: strata/PSU/FPC — **Resolved**. Aggregated SEs (overall, event study, group) use `compute_survey_if_variance()`. Bootstrap uses PSU-level multiplier weights. | `staggered.py` | #237 | Resolved |
57-
| CallawaySantAnna survey + covariates + IPW/DR: DRDID panel nuisance-estimation IF corrections not implemented. Currently gated with NotImplementedError. Regression method with covariates works (has WLS nuisance IF correction). | `staggered.py` | #233 | Medium |
58-
| SyntheticDiD/TROP survey: strata/PSU/FPC — **Resolved**. Rao-Wu rescaled bootstrap implemented for both. TROP uses cross-classified pseudo-strata. Rust TROP remains pweight-only (Python fallback for full design). | `synthetic_did.py`, `trop.py` || Resolved |
59-
| EfficientDiD hausman_pretest() clustered covariance stale `n_cl`**Resolved**. Recompute `n_cl` and remap indices after `row_finite` filtering via `np.unique(return_inverse=True)`. | `efficient_did.py` | #230 | Resolved |
60-
| EfficientDiD `control_group="last_cohort"` trims at `last_g - anticipation` but REGISTRY says `t >= last_g`. With `anticipation=0` (default) these are identical. With `anticipation>0`, code is arguably more conservative (excludes anticipation-contaminated periods). Either align REGISTRY with code or change code to `t < last_g` — needs design decision. | `efficient_did.py` | #230 | Low |
61-
| TripleDifference power: `generate_ddd_data` is a fixed 2×2×2 cross-sectional DGP — no multi-period or unbalanced-group support. Add a `generate_ddd_panel_data` for panel DDD power analysis. | `prep_dgp.py`, `power.py` | #208 | Low |
62-
| ContinuousDiD event-study aggregation anticipation filter — **Resolved**. `_aggregate_event_study()` now filters `e < -anticipation` when `anticipation > 0`, matching CallawaySantAnna behavior. Bootstrap paths also filtered. | `continuous_did.py` | #226 | Resolved |
63-
| Survey design resolution/collapse patterns are inconsistent across panel estimators — ContinuousDiD rebuilds unit-level design in SE code, EfficientDiD builds once in fit(), StackedDiD re-resolves on stacked data; extract shared helpers for panel-to-unit collapse, post-filter re-resolution, and metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low |
64-
| Survey metadata formatting dedup — **Resolved**. Extracted `_format_survey_block()` helper in `results.py`, replaced 13 occurrences across 11 files. | `results.py` + 10 results files || Resolved |
65-
| TROP: `fit()` and `_fit_global()` share ~150 lines of near-identical data setup (panel pivoting, absorbing-state validation, first-treatment detection, effective rank, NaN warnings). Both bootstrap methods also duplicate the stratified resampling loop. Extract shared helpers to eliminate cross-file sync risk. | `trop.py`, `trop_global.py`, `trop_local.py` || Low |
57+
| CallawaySantAnna survey + covariates + IPW/DR: DRDID panel nuisance-estimation IF corrections not implemented. Currently gated with NotImplementedError. Regression method with covariates works. | `staggered.py` | #233 | Medium — tracked in Survey Phase 7a |
58+
| EfficientDiD `control_group="last_cohort"` trims at `last_g - anticipation` but REGISTRY says `t >= last_g`. With `anticipation=0` (default) these are identical. Needs design decision for `anticipation>0`. | `efficient_did.py` | #230 | Low |
59+
| TripleDifference power: `generate_ddd_data` is a fixed 2×2×2 cross-sectional DGP — no multi-period or unbalanced-group support. | `prep_dgp.py`, `power.py` | #208 | Low |
60+
| Survey design resolution/collapse patterns inconsistent across panel estimators — extract shared helpers for panel-to-unit collapse, post-filter re-resolution, metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low |
61+
| TROP: `fit()` and `_fit_global()` share ~150 lines of near-identical data setup. Extract shared helpers to eliminate cross-file sync risk. | `trop.py`, `trop_global.py`, `trop_local.py` || Low |
6662

6763
#### Performance
6864

@@ -75,12 +71,11 @@ Deferred items from PR reviews that were not addressed before merge.
7571

7672
| Issue | Location | PR | Priority |
7773
|-------|----------|----|----------|
74+
| Plotly renderers silently ignore styling kwargs (marker, markersize, linewidth, capsize, ci_linewidth) that the matplotlib backend honors; thread them through or reject when `backend="plotly"` | `visualization/_event_study.py`, `_diagnostic.py`, `_power.py` | #222 | Medium |
7875
| R comparison tests spawn separate `Rscript` per test (slow CI) | `tests/test_methodology_twfe.py:294` | #139 | Low |
7976
| CS R helpers hard-code `xformla = ~ 1`; no covariate-adjusted R benchmark for IRLS path | `tests/test_methodology_callaway.py` | #202 | Low |
80-
| ~376 `duplicate object description` Sphinx warnings — caused by autodoc `:members:` on dataclass attributes within manual API pages (not from autosummary stubs); fix requires restructuring `docs/api/*.rst` pages to avoid documenting the same attribute via both `:members:` and inline `autosummary` tables | `docs/api/*.rst` || Low |
81-
| Plotly renderers silently ignore styling kwargs (marker, markersize, linewidth, capsize, ci_linewidth) that the matplotlib backend honors; thread them through or reject when `backend="plotly"` | `visualization/_event_study.py`, `_diagnostic.py`, `_power.py` | #222 | Medium |
82-
| Survey bootstrap test coverage — **Resolved**. Added FPC census zero-variance, single-PSU NaN, full-design bootstrap for CS/ContinuousDiD/EfficientDiD, and TROP Rao-Wu vs block bootstrap equivalence tests. | `tests/test_survey_phase*.py` || Resolved |
83-
| Doc-snippet smoke tests only cover `.rst` files; new `.txt` AI guides are outside CI validation | `tests/test_doc_snippets.py` | #239 | Low |
77+
| ~376 `duplicate object description` Sphinx warnings — restructure `docs/api/*.rst` to avoid duplicate `:members:` + `autosummary` | `docs/api/*.rst` || Low |
78+
| Doc-snippet smoke tests only cover `.rst` files; `.txt` AI guides outside CI validation | `tests/test_doc_snippets.py` | #239 | Low |
8479

8580
---
8681

@@ -164,11 +159,11 @@ Spurious RuntimeWarnings ("divide by zero", "overflow", "invalid value") are emi
164159

165160
Features in R's `did` package that block porting additional tests:
166161

167-
| Feature | R tests blocked | Priority |
168-
|---------|----------------|----------|
169-
| Repeated cross-sections (`panel=FALSE`) | ~7 tests in test-att_gt.R + test-user_bug_fixes.R | Medium |
170-
| Sampling/population weights | 7 tests incl. all JEL replication | Medium |
171-
| Calendar time aggregation | 1 test in test-att_gt.R | Low |
162+
| Feature | R tests blocked | Priority | Status |
163+
|---------|----------------|----------|--------|
164+
| Repeated cross-sections (`panel=FALSE`) | ~7 tests in test-att_gt.R + test-user_bug_fixes.R | High | Planned — Survey Phase 7b |
165+
| Sampling/population weights | 7 tests incl. all JEL replication | Medium | Mostly resolved (Phases 1-6); CS IPW/DR + covariates + survey in Phase 7a |
166+
| Calendar time aggregation | 1 test in test-att_gt.R | Low | |
172167

173168
---
174169

0 commit comments

Comments
 (0)