
Commit d23b8c3

igerber and claude committed
Add survey data support for SyntheticDiD and TROP (Phase 5)
Adds pweight-only survey integration for the last two estimators that lacked survey support.

- SDID: both sides weighted (WLS interpretation); treated means are survey-weighted, and omega is composed with the control survey weights after optimization.
- TROP: survey weights enter the ATT aggregation only.

The Rust backend is updated for both bootstrap functions. Includes 26 new tests, REGISTRY.md methodology notes, and roadmap updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent aec5671 commit d23b8c3
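
The weighting described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `synthetic_did.py` or `trop.py`: the function names are hypothetical, and the renormalization of the composed weights to sum to one is an assumption that the commit message does not confirm.

```python
import numpy as np


def compose_unit_weights(omega, w_control):
    """Hypothetical helper: combine optimized SDID unit weights (omega)
    with control-unit survey weights after the optimization step."""
    omega = np.asarray(omega, dtype=float)
    w_control = np.asarray(w_control, dtype=float)
    composed = omega * w_control
    total = composed.sum()
    if total <= 0:
        raise ValueError("composed weights must have a positive sum")
    # Assumption: renormalize so the composed weights sum to one.
    return composed / total


def weighted_treated_mean(y_treated, w_treated):
    """Survey-weighted mean over treated units, one value per period.

    y_treated has shape (n_periods, n_treated).
    """
    return np.average(np.asarray(y_treated, dtype=float), axis=1, weights=w_treated)


def weighted_att(unit_effects, w_treated):
    """TROP-style aggregation sketch: survey weights enter only here."""
    return float(np.average(unit_effects, weights=w_treated))


omega_survey = compose_unit_weights([0.5, 0.5], [1.0, 3.0])
treated_path = weighted_treated_mean([[1.0, 3.0], [2.0, 4.0]], [1.0, 3.0])
att = weighted_att([1.0, 2.0], [1.0, 3.0])
```

With equal optimized weights, the control unit carrying three times the survey weight ends up with three times the composed weight, which is the intended effect of composing after optimization rather than re-solving the weight problem.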

12 files changed

Lines changed: 1309 additions & 278 deletions


TODO.md

Lines changed: 1 addition & 0 deletions
@@ -54,6 +54,7 @@ Deferred items from PR reviews that were not addressed before merge.
 | Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium |
 | CallawaySantAnna survey: strata/PSU/FPC rejected at runtime. Full design-based SEs require routing the combined IF/WIF through `compute_survey_vcov()`. Currently weights-only. | `staggered.py` | #233 | Medium |
 | CallawaySantAnna survey + covariates + IPW/DR: DRDID panel nuisance-estimation IF corrections not implemented. Currently gated with NotImplementedError. Regression method with covariates works (has WLS nuisance IF correction). | `staggered.py` | #233 | Medium |
+| SyntheticDiD/TROP survey: strata/PSU/FPC deferred. Full design-based bootstrap (Rao-Wu rescaled weights) needed for survey-aware resampling. Currently pweight-only. | `synthetic_did.py`, `trop.py` || Medium |
 | EfficientDiD hausman_pretest() clustered covariance uses stale `n_cl` after filtering non-finite EIF rows — should recompute effective cluster count and remap indices after `row_finite` filtering | `efficient_did.py` | #230 | Medium |
 | EfficientDiD `control_group="last_cohort"` trims at `last_g - anticipation` but REGISTRY says `t >= last_g`. With `anticipation=0` (default) these are identical. With `anticipation>0`, code is arguably more conservative (excludes anticipation-contaminated periods). Either align REGISTRY with code or change code to `t < last_g` — needs design decision. | `efficient_did.py` | #230 | Low |
 | TripleDifference power: `generate_ddd_data` is a fixed 2×2×2 cross-sectional DGP — no multi-period or unbalanced-group support. Add a `generate_ddd_panel_data` for panel DDD power analysis. | `prep_dgp.py`, `power.py` | #208 | Low |
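
The new TODO row defers a design-based bootstrap using Rao-Wu rescaled weights. As a rough illustration only, and not code from this commit, one replicate of that rescaling might look like the sketch below. It assumes no finite population correction, draws `n_h - 1` PSUs with replacement in each stratum, and uses a hypothetical function name and argument layout.

```python
import numpy as np


def rao_wu_replicate(weights, strata, psu, rng):
    """One Rao-Wu rescaled bootstrap replicate of the survey weights.

    Hypothetical sketch: within each stratum h with n_h PSUs, draw
    n_h - 1 PSUs with replacement and rescale each weight by
    (n_h / (n_h - 1)) * r, where r is how often that PSU was drawn.
    """
    weights = np.asarray(weights, dtype=float)
    strata = np.asarray(strata)
    psu = np.asarray(psu)
    out = np.zeros_like(weights)
    for h in np.unique(strata):
        in_h = strata == h
        psus_h = np.unique(psu[in_h])
        n_h = len(psus_h)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        draws = rng.choice(psus_h, size=n_h - 1, replace=True)
        for p in psus_h:
            r = np.count_nonzero(draws == p)
            mask = in_h & (psu == p)
            out[mask] = weights[mask] * (n_h / (n_h - 1)) * r
    return out


rng = np.random.default_rng(0)
replicate = rao_wu_replicate(
    weights=np.ones(6),
    strata=[0, 0, 0, 1, 1, 1],
    psu=[0, 1, 2, 3, 4, 5],
    rng=rng,
)
```

With equal weights and one observation per PSU, each stratum's rescaled weights sum to the original stratum total, which is a quick sanity check on the rescaling factor.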

diff_diff/results.py

Lines changed: 35 additions & 0 deletions
@@ -680,6 +680,8 @@ class SyntheticDiDResults:
     pre_treatment_fit: Optional[float] = field(default=None)
     placebo_effects: Optional[np.ndarray] = field(default=None)
     n_bootstrap: Optional[int] = field(default=None)
+    # Survey design metadata (SurveyMetadata instance from diff_diff.survey)
+    survey_metadata: Optional[Any] = field(default=None)
 
     def __repr__(self) -> str:
         """Concise string representation."""
@@ -735,6 +737,28 @@ def summary(self, alpha: Optional[float] = None) -> str:
         if self.variance_method == "bootstrap" and self.n_bootstrap is not None:
             lines.append(f"{'Bootstrap replications:':<25} {self.n_bootstrap:>10}")
 
+        # Add survey design info
+        if self.survey_metadata is not None:
+            sm = self.survey_metadata
+            lines.extend(
+                [
+                    "",
+                    "-" * 75,
+                    "Survey Design".center(75),
+                    "-" * 75,
+                    f"{'Weight type:':<25} {sm.weight_type:>10}",
+                ]
+            )
+            if sm.n_strata is not None:
+                lines.append(f"{'Strata:':<25} {sm.n_strata:>10}")
+            if sm.n_psu is not None:
+                lines.append(f"{'PSU/Cluster:':<25} {sm.n_psu:>10}")
+            lines.append(f"{'Effective sample size:':<25} {sm.effective_n:>10.1f}")
+            lines.append(f"{'Design effect (DEFF):':<25} {sm.design_effect:>10.2f}")
+            if sm.df_survey is not None:
+                lines.append(f"{'Survey d.f.:':<25} {sm.df_survey:>10}")
+            lines.append("-" * 75)
+
         lines.extend(
             [
                 "",
@@ -812,6 +836,17 @@ def to_dict(self) -> Dict[str, Any]:
         }
         if self.n_bootstrap is not None:
             result["n_bootstrap"] = self.n_bootstrap
+        if self.survey_metadata is not None:
+            sm = self.survey_metadata
+            result["weight_type"] = sm.weight_type
+            result["effective_n"] = sm.effective_n
+            result["design_effect"] = sm.design_effect
+            if sm.n_strata is not None:
+                result["n_strata"] = sm.n_strata
+            if sm.n_psu is not None:
+                result["n_psu"] = sm.n_psu
+            if sm.df_survey is not None:
+                result["df_survey"] = sm.df_survey
         return result
 
     def to_dataframe(self) -> pd.DataFrame:
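
To see what the new summary block prints for a pweight-only design, here is a small self-contained sketch. `SurveyMetadataSketch` is a hypothetical stand-in that carries only the attributes the diff reads; the real `SurveyMetadata` class in `diff_diff.survey` is not shown in this commit view. The Kish formulas used for the effective sample size and design effect are a common approximation and are assumed here, not confirmed as the library's own computation.

```python
from dataclasses import dataclass
from typing import List, Optional

import numpy as np


@dataclass
class SurveyMetadataSketch:
    """Hypothetical stand-in with the fields accessed in the diff."""
    weight_type: str
    effective_n: float
    design_effect: float
    n_strata: Optional[int] = None
    n_psu: Optional[int] = None
    df_survey: Optional[int] = None


def kish_metadata(weights, weight_type: str = "pweight") -> SurveyMetadataSketch:
    """Assumed Kish approximation: n_eff = (sum w)^2 / sum w^2, DEFF = n / n_eff."""
    w = np.asarray(weights, dtype=float)
    n_eff = float(w.sum() ** 2 / (w ** 2).sum())
    return SurveyMetadataSketch(weight_type, n_eff, float(len(w) / n_eff))


def survey_block(sm: SurveyMetadataSketch) -> List[str]:
    """Mirror of the formatting added to summary() in the hunk above."""
    lines = [
        "",
        "-" * 75,
        "Survey Design".center(75),
        "-" * 75,
        f"{'Weight type:':<25} {sm.weight_type:>10}",
    ]
    if sm.n_strata is not None:
        lines.append(f"{'Strata:':<25} {sm.n_strata:>10}")
    if sm.n_psu is not None:
        lines.append(f"{'PSU/Cluster:':<25} {sm.n_psu:>10}")
    lines.append(f"{'Effective sample size:':<25} {sm.effective_n:>10.1f}")
    lines.append(f"{'Design effect (DEFF):':<25} {sm.design_effect:>10.2f}")
    if sm.df_survey is not None:
        lines.append(f"{'Survey d.f.:':<25} {sm.df_survey:>10}")
    lines.append("-" * 75)
    return lines


sm = kish_metadata([1.0, 1.0, 2.0, 2.0])
block = survey_block(sm)
print("\n".join(block))
```

Because strata, PSU, and survey degrees of freedom are optional, a pweight-only fit prints only the weight type, effective sample size, and design effect, and `to_dict()` omits the corresponding keys in the same way.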
