Data, analysis, and figures for the paper "Public Power, Private Debts: The Misuse of California's Tax Intercept Power for Private Colleges" by Dalié Jiménez, Jonathan Glater, Andrew Martin & Charlie Eaton, forthcoming in the Yale Law Journal Forum (2026).
This repository contains the data pipeline and analysis for a study of California's Interagency Intercept Collection (IIC) program, focusing on private colleges and universities. The project extracts offset data from public records obtained via the California Public Records Act (PRA), cleans and normalizes it, and produces the figures and tables used in the paper.
CA-Private-Colleges/
|-- PRA Response/
|-- code/
|-- data/
| |-- raw/
| |-- cleaned/
|-- figures/
|-- README.md
`PRA Response/` holds the unedited responses from the Franchise Tax Board (OCR was added to enable text extraction; the content is otherwise unmodified).
Scripts in `code/` are numbered in pipeline order:
| File | Description |
|---|---|
| `0_extract_iic_offset_2018_2023.py` | Extracts tabular data from PRA response PDFs into a normalized CSV |
| `1_clean_iic_data.R` | Cleans and normalizes the extracted CSV for analysis |
| `2_verify_extraction.R` | Verification checks on the extracted and cleaned data |
| `3_iic_offset_data_analysis.R` | Main analysis: figures and tables for the paper |
| `4_iic_agency_enrollments_analysis.r` | Enrollment timeline analysis and figures |
| `PROJECT CONTEXT.md` | Detailed documentation of transformation rules, parser behavior, and data quality notes |
- `raw/`: Raw CSV data extracted directly from the PRA response documents
- `cleaned/`: Cleaned and normalized datasets used to create the figures and tables in the paper
`figures/` contains all figures (charts) and tables generated by the analysis code, as used in the paper.
- Extract data from PDFs (requires Python 3 + pdfplumber): `python3 code/0_extract_iic_offset_2018_2023.py`
- Run the R scripts in order (requires R + packages listed below):
  - `1_clean_iic_data.R`
  - `2_verify_extraction.R`
  - `3_iic_offset_data_analysis.R`
  - `4_iic_agency_enrollments_analysis.r`
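The run order above could be scripted so that every step executes in sequence and the run stops at the first failure. The following is a minimal sketch, not part of the repository: the script names come from this README, but the helper names (`build_commands`, `run_pipeline`), the use of `Rscript`, and the assumption that every script can be launched from the repository root are illustrative.

```python
import subprocess
from pathlib import Path

# Script names are taken from the README. The interpreter commands and the
# assumption that each script runs from the repository root are hypothetical.
PIPELINE = [
    ("python3", "0_extract_iic_offset_2018_2023.py"),
    ("Rscript", "1_clean_iic_data.R"),
    ("Rscript", "2_verify_extraction.R"),
    ("Rscript", "3_iic_offset_data_analysis.R"),
    ("Rscript", "4_iic_agency_enrollments_analysis.r"),
]


def build_commands(code_dir="code"):
    """Return the pipeline as a list of argv lists, in run order."""
    return [[interp, str(Path(code_dir) / script)] for interp, script in PIPELINE]


def run_pipeline(code_dir="code", dry_run=False):
    """Run each step in order; a non-zero exit status stops the run."""
    for cmd in build_commands(code_dir):
        print("running:", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if the step fails
            subprocess.run(cmd, check=True)
```

Calling `run_pipeline(dry_run=True)` only prints the commands, which is a cheap way to confirm the order before running anything. If the R scripts expect a different working directory, the paths would need adjusting.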
Dependencies
Python 3 with pdfplumber (pip install pdfplumber)
R with tidyverse, lubridate, scales, readr, flextable
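Before running the pipeline, it may help to confirm that the tools listed above are reachable. This is a small sketch using only the Python standard library; `check_dependencies` is a hypothetical helper, and it checks only that the `pdfplumber` module and an `Rscript` executable can be found, not that the individual R packages are installed.

```python
import importlib.util
import shutil


def check_dependencies(python_modules=("pdfplumber",), executables=("Rscript",)):
    """Map each dependency name to True if it can be located, else False."""
    status = {}
    for mod in python_modules:
        # find_spec returns None when a top-level module is not installed
        status[mod] = importlib.util.find_spec(mod) is not None
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH
        status[exe] = shutil.which(exe) is not None
    return status
```

Printing `check_dependencies()` yields a dictionary such as one entry for `pdfplumber` and one for `Rscript`; any `False` value points to something that still needs installing.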
All source data was obtained via PRA requests to the California Franchise Tax Board. Original documents are preserved unmodified in `PRA Response/`.
Citation
If you use this data or code, please cite:
Jiménez, Dalié, Jonathan Glater, Andrew Martin & Charlie Eaton. "Public Power, Private Debts: The Misuse of California's Tax Intercept Power for Private Colleges." Yale Law Journal Forum (forthcoming 2026).
This work is licensed under a Creative Commons Attribution 4.0 International License (CC-BY-4.0).
You are free to share and adapt this material for any purpose, provided you give appropriate attribution to the authors and the Yale Law Journal Forum.