Merged
2 changes: 1 addition & 1 deletion .github/workflows/coverage.yml
@@ -17,7 +17,7 @@ jobs:
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: 3.9
python-version: 3.11

- name: Install poetry
uses: snok/install-poetry@v1
2 changes: 1 addition & 1 deletion .github/workflows/main.yml
@@ -13,7 +13,7 @@ jobs:
strategy:
fail-fast: true
matrix:
python-version: ["3.9", "3.10", "3.11", "3.12"]
python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]

steps:
- uses: actions/checkout@v4
12 changes: 12 additions & 0 deletions docs/source/change-log.rst
@@ -11,6 +11,18 @@ This project adheres to `Semantic Versioning`_.
.. _Semantic Versioning: http://semver.org/


0.8.0
-----
2026-02-?

* New Features

* Support for IPUMS DHS extract API

* Breaking Changes

* Dropped support for Python 3.9

0.7.0
-----
2025-06-09
2 changes: 1 addition & 1 deletion docs/source/getting_started.rst
@@ -8,7 +8,7 @@ Getting Started
Installation
------------

This package requires that you have at least Python 3.9 installed.
This package requires that you have at least Python 3.10 installed.

Install with ``pip``:

5 changes: 5 additions & 0 deletions docs/source/ipums_api/index.rst
@@ -73,6 +73,11 @@ available features for all collections currently supported by the API:
- ``mtus``
- **X**
-
* - `IPUMS DHS <https://www.idhsdata.org/idhs/index.shtml>`__
- Microdata
- ``dhs``
- **X**
-
* - `IPUMS NHIS <https://nhis.ipums.org/nhis/>`__
- Microdata
- ``nhis``
4 changes: 2 additions & 2 deletions docs/source/ipums_api/ipums_api_micro/index.rst
@@ -95,7 +95,7 @@ The table below shows the available data structures and the IPUMS data collectio
- all IPUMS microdata collections
* - hierarchical
- ``data_structure={"hierarchical": {}}``
- all IPUMS microdata collections
- ``usa``, ``cps``, ``atus``, ``ahtus``, ``mtus``, ``nhis``, ``meps``, ``ipumsi``
* - rectangular on Activity
- ``data_structure={"rectangular": {"on": "A"}}``
- ``atus``, ``ahtus``, ``mtus``
@@ -132,7 +132,7 @@ Extract Features
----------------

Certain features of a :class:`MicrodataExtract<ipumspy.api.extract.MicrodataExtract>` can be
added or updated before an extract request is submitted. This section
added or updated before an extract request is submitted. Note that not all features are available for every IPUMS data collection. For example, case selection is not a feature of IPUMS Time Use data and automatic inclusion of data quality flags is not a feature of IPUMS DHS. Details regarding available extract features for each IPUMS data collection can be found at the `IPUMS developer portal <https://developer.ipums.org/docs/v2/apiprogram/apis/microdata/>`_. This section
demonstrates adding features to the following IPUMS CPS extract.

.. code:: python
3,129 changes: 1,593 additions & 1,536 deletions poetry.lock

Large diffs are not rendered by default.

20 changes: 10 additions & 10 deletions pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "ipumspy"
version = "0.7.0"
version = "0.8.0"
description = "A collection of tools for working with IPUMS data"
authors = ["Kevin H. Wilson <kevin_wilson@brown.edu>",
"Renae Rodgers <rodge103@umn.edu>"]
@@ -10,29 +10,29 @@ packages = [
]

[tool.poetry.dependencies]
python = "^3.9"
python = "^3.10"
pandas = "^2.2.3"
numpy = "^2.0.0"
click = "^8.0.0"
pyarrow = "^18.1.0"
requests = {extras = ["use_chardet_on_py3"], version = "^2.26.0"}
pyarrow = "^23.0.0"
requests = {extras = ["use_chardet_on_py3"], version = "^2.32.5"}
importlib-metadata = "^4.13.0"
PyYAML = "^6.0.1"

[tool.poetry.dev-dependencies]
[tool.poetry.group.dev.dependencies]
black = "^24.4.0"
pylint = "^2.7.4"
isort = "^5.8.0"
mypy = "^1.0.1"

[tool.poetry.group.test.dependencies]
pytest = "^8.3.4"
pytest = "^9.0.0"
pytest-cov = "^4.0.0"
python-dotenv = "^1.0.0"
fastapi = "^0.115.12"
uvicorn = {extras = ["standard"], version = "^0.34.1"}
pytest-recording = "^0.12.2"
vcrpy = "^4.2.1"
fastapi = "^0.128.1"
uvicorn = {extras = ["standard"], version = "^0.40.0"}
pytest-recording = "^0.13.4"
vcrpy = "^8.0.0"

[tool.poetry.group.docs]
optional=true
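
The constraints in this hunk use Poetry's caret notation, where the upper bound comes from bumping the leftmost non-zero component. So `^3.10` excludes Python 3.9 but admits any later 3.x release. A minimal sketch of that rule for simple numeric versions; `caret_bounds` and `allows` are hypothetical helpers written for illustration, not Poetry functions:

```python
from typing import Tuple

Version = Tuple[int, int, int]


def caret_bounds(spec: str) -> Tuple[Version, Version]:
    """Return (inclusive lower, exclusive upper) for a caret spec like "^3.10"."""
    parts = [int(p) for p in spec.lstrip("^").split(".")]
    lower = tuple(parts + [0] * (3 - len(parts)))
    # Bump the leftmost non-zero component; if all given components
    # are zero, bump the last one that was specified.
    idx = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = list(lower[: idx + 1]) + [0] * (2 - idx)
    upper[idx] += 1
    return lower, tuple(upper)


def allows(spec: str, version: Version) -> bool:
    """Check whether `version` falls inside the caret range of `spec`."""
    lower, upper = caret_bounds(spec)
    return lower <= version < upper


python_range = caret_bounds("^3.10")
fastapi_range = caret_bounds("^0.128.1")
```

Because `fastapi` is still at major version 0, its caret range only spans the 0.128.x releases, while the `python` range spans every 3.x release from 3.10 onward.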
3 changes: 2 additions & 1 deletion src/ipumspy/api/extract.py
@@ -360,6 +360,7 @@ def _get_collection_type(collection: str) -> str:
"mtus": "microdata",
"nhis": "microdata",
"meps": "microdata",
"dhs": "microdata",
"nhgis": "aggregate_data",
"ihgis": "aggregate_data",
}
@@ -680,7 +681,7 @@ def build(self) -> Dict[str, Any]:
}

# XXX shoehorn fix until server-side bug is fixed
if self.collection == "meps":
if self.collection == "meps" or self.collection == "dhs":
for variable in built["variables"].keys():
built["variables"][variable].pop("attachedCharacteristics")

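
The workaround in this hunk removes `attachedCharacteristics` from every variable when the collection is `meps` or `dhs`. A minimal standalone sketch of the same idea, assuming the built request is a plain dictionary; `strip_attached_characteristics` and `AFFECTED_COLLECTIONS` are hypothetical names used for illustration and are not part of ipumspy. Unlike the code in the diff, this version passes a default to `pop` so that a missing key does not raise:

```python
from typing import Any, Dict

# Hypothetical constant: the collections the workaround in the diff applies to.
AFFECTED_COLLECTIONS = {"meps", "dhs"}


def strip_attached_characteristics(
    collection: str, built: Dict[str, Any]
) -> Dict[str, Any]:
    """Drop `attachedCharacteristics` from each variable, modifying `built` in place."""
    if collection in AFFECTED_COLLECTIONS:
        for spec in built["variables"].values():
            # A default of None means a missing key is silently ignored.
            spec.pop("attachedCharacteristics", None)
    return built


# Illustrative request body; the field values are made up.
request = {
    "variables": {"AGE": {"attachedCharacteristics": [], "caseSelections": {}}}
}
cleaned = strip_attached_characteristics("dhs", request)
```

Keeping the affected collections in one set means a further collection could be added without lengthening the condition.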
19 changes: 4 additions & 15 deletions src/ipumspy/readers.py
@@ -217,16 +217,11 @@ def _get_common_vars(ddi: ddi_definitions.Codebook, data_description: List):
# these are delimited by spaces within the string attribute
# this list would probably be a useful thing to have as a file-level attribute...

# XXX: this is to work around an issue with the Health Surveys DDI specifically.
# Revert to previous method of using the file_description rectypes once this
# DDI issue is fixed
rectype_desc = [desc for desc in data_description if desc.name == "RECTYPE"][0]
all_rectypes = rectype_desc.rectype.split(" ")
common_vars = [
desc.name
for desc in data_description
# if sorted(desc.rectype.split(" ")) == sorted(ddi.file_description.rectypes)
if sorted(desc.rectype.split(" ")) == sorted(all_rectypes)
if sorted(desc.rectype.split(" ")) == sorted(ddi.file_description.rectypes)
# if sorted(desc.rectype.split(" ")) == sorted(all_rectypes)
]

return common_vars
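
The comparison in `_get_common_vars` treats a variable as common when its space-delimited record types, once sorted, match the file-level record types. A self-contained sketch of that check, assuming a simplified stand-in for the DDI variable description; `VarDesc` and `get_common_vars` are hypothetical names, not the ipumspy classes:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VarDesc:
    """Hypothetical stand-in for a DDI variable description."""

    name: str
    rectype: str  # space-delimited record types, e.g. "H P"


def get_common_vars(
    file_rectypes: List[str], data_description: List[VarDesc]
) -> List[str]:
    """Return names of variables that appear on every record type in the file."""
    wanted = sorted(file_rectypes)
    return [
        desc.name
        for desc in data_description
        # Sorting both sides makes the comparison order-insensitive.
        if sorted(desc.rectype.split(" ")) == wanted
    ]


descs = [
    VarDesc("RECTYPE", "H P"),
    VarDesc("HHWT", "H"),
    VarDesc("AGE", "P"),
    VarDesc("SERIAL", "P H"),
]
common = get_common_vars(["H", "P"], descs)
```

Here `SERIAL` counts as common even though its record types are listed in a different order from the file-level list.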
@@ -355,14 +350,8 @@ def read_hierarchical_microdata(
else:
df_dict = {}
common_vars = _get_common_vars(ddi, data_description)
# XXX: this is to work around an issue with the Health Surveys DDI specifically.
# Revert to previous method of using the file_description rectypes once this
# DDI issue is fixed
rectype_desc = [desc for desc in data_description if desc.name == "RECTYPE"][0]
all_rectypes = rectype_desc.rectype.split(" ")

# for rectype in ddi.file_description.rectypes:
for rectype in all_rectypes:

for rectype in ddi.file_description.rectypes:
rectype_vars = _get_rectype_vars(
ddi, rectype, common_vars, data_description
)
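
The loop above builds one variable list per record type found in the DDI file description. The body of `_get_rectype_vars` is not shown in this diff, so the following is only a sketch of one plausible reading: common variables first, followed by variables specific to the record type. `rectype_vars` and the sample data are hypothetical.

```python
from typing import Dict, List


def rectype_vars(
    rectype: str, common: List[str], var_rectypes: Dict[str, str]
) -> List[str]:
    """Common variables first, then the variables specific to `rectype`."""
    specific = [
        name
        for name, rectypes in var_rectypes.items()
        if rectype in rectypes.split(" ") and name not in common
    ]
    return common + specific


# Variable name -> space-delimited record types it belongs to (made-up data).
var_rectypes = {"RECTYPE": "H P", "SERIAL": "H P", "HHWT": "H", "AGE": "P"}
common = ["RECTYPE", "SERIAL"]

# One column list per record type, mirroring the per-rectype loop in the diff.
columns_by_rectype = {
    rectype: rectype_vars(rectype, common, var_rectypes) for rectype in ["H", "P"]
}
```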
Binary file not shown.