Commit de58260

Updates for v1.1.3:
1. Merge pull request: add DVH constraint using CVaR; update clinical_criteria.py
2. Create a download script for downloading data from Hugging Face
3. Add a unit test for CVaR
4. Update the basic tutorial
1 parent 742ece6 commit de58260

9 files changed

Lines changed: 427 additions & 106 deletions
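The DVH constraint added in this release uses CVaR (conditional value-at-risk), which conservatively enforces "at most an alpha-fraction of the volume exceeds a dose limit" via the standard Rockafellar–Uryasev tail bound. A minimal numerical sketch of the idea (the `cvar` helper and the toy doses are illustrative only, not PortPy API):

```python
import numpy as np

def cvar(d, alpha):
    # Rockafellar–Uryasev form: mean of the worst alpha-fraction of voxel doses
    zeta = np.quantile(d, 1 - alpha)  # value-at-risk (tail threshold)
    return zeta + np.mean(np.maximum(d - zeta, 0.0)) / alpha

rng = np.random.default_rng(0)
d = rng.uniform(0.0, 60.0, 10_000)  # toy voxel doses (Gy)
alpha = 0.48                        # allowed violating volume fraction

# Since CVaR upper-bounds the tail, constraining cvar(d, alpha) <= d_ref
# implies the DVH constraint "at most alpha of voxels exceed d_ref":
d_ref = cvar(d, alpha)
frac_above = float(np.mean(d > d_ref))
assert frac_above <= alpha
```

Because CVaR is convex in the dose (and hence in beamlet intensities for a linear dose model), this surrogate can be used directly inside a convex planning problem, unlike the combinatorial exact DVH constraint.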

README.md

Lines changed: 40 additions & 6 deletions
@@ -196,16 +196,50 @@ PortPy equips researchers with a robust benchmark patient dataset, sourced from
 Currently, this set encompasses only the Lung 2Gy×30 protocol but will be expanded in the future to more protocols as well as TCP/NTCP evaluation functions.
 
 To access these resources, users are advised to download the latest version of the dataset,
-which can be found on hugging face dataset [PortPy_Dataset](https://huggingface.co/datasets/PortPy-Project/PortPy_Dataset).
-You can also browse and download patient data manually using our interactive [Hugging Face Space](https://huggingface.co/spaces/PortPy-Project/portpy_dataset_visualization).
+which can be found in the Hugging Face dataset [PortPy_Dataset](https://huggingface.co/datasets/PortPy-Project/PortPy_Dataset).
+
+## Downloading the PortPy Dataset
+
+PortPy datasets are large (GBs per patient) and are **not bundled** with the Python package.
+You have **two options** to download the dataset:
+
+1. **CLI (recommended)**: `download_portpy_data`
+2. **Python**: `pp.download_portpy_data(...)`
+
+---
+
+### Option 1: CLI
+
+Install the optional dependency and download the data using the command line interface (CLI):
+
+```bash
+pip install "portpy[data]"
+download_portpy_data --patients Lung_Patient_3 Lung_Patient_4 --beam-mode planner --out ./PortPy_Dataset
+```
+- `--patients`: List of patient IDs to download. You can find the available patient IDs on the [Hugging Face dataset page](https://huggingface.co/datasets/PortPy-Project/PortPy_Dataset).
+- `--beam-mode`: "all", "planner", or a list of beam IDs, e.g., 0 10 20. Default is "planner".
+- `--out`: Output directory to save the downloaded data.
+
+### Option 2: Python (`pp.download_portpy_data`)
+
+Use the same downloader from Python (useful in scripts, notebooks, and pipelines):
+```python
+import portpy.photon as pp
+
+pp.download_portpy_data(
+    ["Lung_Patient_3", "Lung_Patient_4"],
+    out="./",             # output directory
+    beam_mode="planner",  # "all", "planner", or a list of beam IDs, e.g., [0, 10, 20]. Default is "planner"
+)
+```
+
 Subsequently, create a directory titled './data' in the current project directory and transfer the downloaded
-file into it. For example, ./data/Lung_Phantom_Patient_1.
+file into it. For example, ./data/Lung_Patient_3.
+You can also browse and download patient data manually (not recommended; usually slow) using our interactive [Hugging Face Space](https://huggingface.co/spaces/PortPy-Project/portpy_dataset_visualization).
 We have adopted the widely-used JSON and HDF5 formats for data storage.
 [HDFViwer](https://www.hdfgroup.org/downloads/hdfview/) can be utilized to view the contents of the HDF5 files.
 
-
-
-**Note:** Initially, we will utilize a lung dataset from [TCIA](https://wiki.cancerimagingarchive.net/display/Public/NSCLC-Radiomics). The original DICOM CT images and structure sets are not included in the PortPy dataset and need to be directly downloaded from the TCIA. Users can fetch the **TCIA collection ID** and the **TCIA subject ID** for each PortPy patient using the *get_tcia_metadata()* method in PortPy and subsequently download the data from TCIA (see [imrt_tps_import](https://github.com/PortPy-Project/PortPy/blob/master/examples/imrt_tps_import.ipynb))
+**Note:** Initially, we will utilize lung and prostate datasets from [TCIA](https://wiki.cancerimagingarchive.net/display/Public/NSCLC-Radiomics). Users can fetch the **TCIA collection ID** and the **TCIA subject ID** for each PortPy patient using the *get_tcia_metadata()* method in PortPy and subsequently download the data from TCIA (see [imrt_tps_import](https://github.com/PortPy-Project/PortPy/blob/master/examples/imrt_tps_import.ipynb)).
 
 
 # Installation <a name="Installation"></a>
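After following the README download steps above, a quick existence check on the expected patient folders can catch path mistakes early. A minimal sketch (`missing_patients` is an illustrative helper, not part of PortPy):

```python
from pathlib import Path

def missing_patients(data_dir, patient_ids):
    """Return the patient folders not yet present under data_dir (illustrative helper)."""
    root = Path(data_dir)
    return [p for p in patient_ids if not (root / p).is_dir()]

# After a successful download, this should return an empty list, e.g.:
# missing_patients("./data", ["Lung_Patient_3", "Lung_Patient_4"]) == []
```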

examples/1_basic_tutorial.ipynb

Lines changed: 59 additions & 19 deletions
@@ -89,21 +89,52 @@
 "id": "655614ea-9dc9-4983-8468-0d9c3e97273b",
 "metadata": {},
 "source": [
-"#### Loading Data with `DataExplorer`\n",
 "\n",
-"You can initialize `DataExplorer` in one of two ways:\n",
 "\n",
-"- 📥 Using a Hugging Face repo ID (`hf_repo_id`) — data is downloaded if not already present.\n",
-"- 📁 Using a local directory (`data_dir`) — useful if you've already downloaded the dataset.\n",
+"#### Downloading and Loading the Data with `DataExplorer`\n",
+"\n",
+"You can browse patient metadata using our interactive [Hugging Face Space](https://huggingface.co/spaces/PortPy-Project/portpy_dataset_visualization). This web app allows you to visually explore patients and their metadata.\n",
+"\n",
+"Once you identify the patient(s) of interest, you can download their data using the CLI command or the Python function available in PortPy.\n",
+"👇 **You can choose either of the options below.**\n",
+"\n",
+"Option 1: Use the CLI command `download_portpy_data` to download data for specific patient(s):\n",
+"```bash\n",
+"download_portpy_data --patients Lung_Patient_3 Lung_Patient_4 --beam-mode planner --out ./\n",
+"```\n",
+"\n",
+"- `--patients`: list of patient IDs\n",
+"- `--beam-mode`: \"all\", \"planner\", or a list of beam IDs, e.g., 0 10 20. Default is \"planner\"\n",
+"- `--out`: output directory\n",
 "\n",
-" You can also browse and download patient data manually using our interactive [Hugging Face Space](https://huggingface.co/spaces/PortPy-Project/portpy_dataset_visualization):\n",
-" This web app allows you to:\n",
-" - Visually explore patients and their metadata\n",
-" - Download selected patient datasets\n",
-" \n",
-" Once downloaded, you can load the data locally\n",
 "\n",
-"👇 **You can choose any of the below option.**"
+"Option 2: Use the PortPy function `download_portpy_data` to download data for specific patient(s):\n",
+"```python\n",
+"pp.download_portpy_data(\n",
+"    patients=[\"Lung_Patient_3\", \"Lung_Patient_4\"],  # list of patient IDs\n",
+"    out=\"./\",  # output directory\n",
+"    beam_mode=\"planner\",  # \"all\", \"planner\", or a list of beam IDs, e.g., [0, 10, 20]. Default is \"planner\"\n",
+")\n",
+"```\n",
+"Your data should now be located at ./data/Lung_Patient_3. We recommend storing patient data under a data/ directory at the project root, but this is not required.\n",
+"```text\n",
+"<project-root>/\n",
+"├── data/\n",
+"│   ├── Lung_Patient_3/\n",
+"│   ├── Lung_Patient_4/\n",
+"│   └── ...\n",
+"├── examples/\n",
+"│   └── example_*.py\n",
+"├── portpy/\n",
+"└── README.md\n",
+"```\n",
+"\n",
+"#### Initializing `DataExplorer`\n",
+"Now you can initialize `DataExplorer` in one of two ways:\n",
+"\n",
+"- 📁 Using a local directory (`data_dir`) — useful if you've already downloaded the dataset.\n",
+"- 📥 Using a Hugging Face repo ID (`hf_repo_id`) — browse metadata first; patient data is downloaded on demand.\n",
+"\n"
 ]
 },
 {
@@ -196,12 +227,22 @@
 }
 ],
 "source": [
-"# If you have already downloaded data, you can specify the patient data location and create data explorer.\n",
-"# data_dir = r'../../data'\n",
-"# data = pp.DataExplorer(data_dir=data_dir)\n",
+"# ------------------------------------------------------------\n",
+"# Option 1: Load data from a local directory (recommended if\n",
+"# you have already downloaded patient data)\n",
+"# ------------------------------------------------------------\n",
+"data_dir = r'../../data'\n",
+"data = pp.DataExplorer(data_dir=data_dir)\n",
 "\n",
-"# Use PortPy DataExplorer class to explore PortPy data\n",
-"data = pp.DataExplorer(hf_repo_id=\"PortPy-Project/PortPy_Dataset\", local_download_dir='../hugging_face_data')\n",
+"# ------------------------------------------------------------\n",
+"# Option 2: Explore the full PortPy dataset directly from\n",
+"# Hugging Face (metadata access; data downloaded on demand)\n",
+"# ------------------------------------------------------------\n",
+"# Uncomment the following lines if you have NOT downloaded data locally\n",
+"# data = pp.DataExplorer(\n",
+"#     hf_repo_id=\"PortPy-Project/PortPy_Dataset\",\n",
+"#     local_download_dir=\"../hugging_face_data\"\n",
+"# )\n",
 "\n",
 "# display the list of patients available in portpy dataset\n",
 "df = data.display_list_of_patients(return_df=True)\n",
@@ -359,9 +400,8 @@
 "# pick a patient from the existing patient list to get detailed info (e.g., beam angles, structures).\n",
 "data.patient_id = 'Lung_Patient_3'\n",
 "\n",
-"# download patient data for only expert selected beams from hugging face. \n",
-"# Users can download all beams by using use_planner_beams_only = False.\n",
-"data.filter_and_download_hf_dataset()\n",
+"# You can also download the above patient's data on demand from Hugging Face. All beams can be downloaded by setting use_planner_beams_only = False.\n",
+"# data.filter_and_download_hf_dataset()\n",
 "\n",
 "# display the data of the patient \n",
 "# user can get the results back in the panda dataframe format by using the arguments 'return_beams_df' and 'return_structs_df'\n",

portpy/config_files/clinical_criteria/Default/Lung_2Gy_30Fx.json

Lines changed: 14 additions & 7 deletions
@@ -7,7 +7,8 @@
 {
   "type": "max_dose",
   "parameters": {
-    "structure_name": "GTV"
+    "structure_name": "GTV",
+    "weight": 50
   },
   "constraints": {
     "limit_dose_gy": 69,
@@ -17,7 +18,8 @@
 {
   "type": "max_dose",
   "parameters": {
-    "structure_name": "PTV"
+    "structure_name": "PTV",
+    "weight": 50
   },
   "constraints": {
     "limit_dose_gy": 69,
@@ -36,7 +38,8 @@
 {
   "type": "mean_dose",
   "parameters": {
-    "structure_name": "ESOPHAGUS"
+    "structure_name": "ESOPHAGUS",
+    "weight": 50
   },
   "constraints": {
     "limit_dose_gy": 34,
@@ -65,7 +68,8 @@
 {
   "type": "mean_dose",
   "parameters": {
-    "structure_name": "HEART"
+    "structure_name": "HEART",
+    "weight": 50
   },
   "constraints": {
     "limit_dose_gy": 27,
@@ -86,7 +90,8 @@
   "type": "dose_volume_V",
   "parameters": {
     "structure_name": "HEART",
-    "dose_gy": 30
+    "dose_gy": 30,
+    "weight": 50
   },
   "constraints": {
     "goal_volume_perc": 48
@@ -113,7 +118,8 @@
 {
   "type": "max_dose",
   "parameters": {
-    "structure_name": "CORD"
+    "structure_name": "CORD",
+    "weight": 50
   },
   "constraints": {
     "limit_dose_gy": 50,
@@ -143,7 +149,8 @@
   "type": "mean_dose",
   "parameters": {
     "structure_name": "LUNGS_NOT_GTV",
-    "structure_def": "(LUNG_L | LUNG_R) - GTV"
+    "structure_def": "(LUNG_L | LUNG_R) - GTV",
+    "weight": 50
   },
   "constraints": {
     "limit_dose_gy": 21,
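Each edited criterion above now carries a `weight` parameter alongside its existing fields. A quick sketch of reading those weights back with a plain JSON round-trip (the two-entry fragment is hypothetical, with field names mirroring the diff):

```python
import json

# Hypothetical fragment mirroring the edited Lung_2Gy_30Fx.json entries
criteria_json = """
[
  {"type": "max_dose",
   "parameters": {"structure_name": "CORD", "weight": 50},
   "constraints": {"limit_dose_gy": 50}},
  {"type": "dose_volume_V",
   "parameters": {"structure_name": "HEART", "dose_gy": 30, "weight": 50},
   "constraints": {"goal_volume_perc": 48}}
]
"""
criteria = json.loads(criteria_json)
weights = {c["parameters"]["structure_name"]: c["parameters"]["weight"] for c in criteria}
assert weights == {"CORD": 50, "HEART": 50}
```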

portpy/photon/clinical_criteria.py

Lines changed: 6 additions & 8 deletions
@@ -272,8 +272,6 @@ def get_dvh_table(self, my_plan: Plan, constraint_list: list = None, opt_params:
         count = 0
         for i in range(len(dvh_updated_list)):
 
-            dvh_method = dvh_updated_list[i]['parameters'].get('dvh_method', None)
-
             if 'dose_volume_V' in dvh_updated_list[i]['type']:
                 limit_key = self.matching_keys(dvh_updated_list[i]['constraints'], 'limit')
                 dose_key = self.matching_keys(dvh_updated_list[i]['parameters'], 'dose_')
@@ -282,7 +280,7 @@ def get_dvh_table(self, my_plan: Plan, constraint_list: list = None, opt_params:
                 df.at[count, 'dose_gy'] = self.dose_to_gy(dose_key, dvh_updated_list[i]['parameters'][dose_key])
                 df.at[count, 'volume_perc'] = dvh_updated_list[i]['constraints'][limit_key]
                 df.at[count, 'dvh_type'] = 'constraint'
-                df.at[count, 'dvh_method'] = dvh_method
+                df.at[count, 'dvh_method'] = dvh_updated_list[i]['parameters'].get('dvh_method', None)
                 df.at[count, 'bound_type'] = dvh_updated_list[i]['constraints'].get('bound_type', 'upper')
                 count = count + 1
                 goal_key = self.matching_keys(dvh_updated_list[i]['constraints'], 'goal')
@@ -291,8 +289,8 @@ def get_dvh_table(self, my_plan: Plan, constraint_list: list = None, opt_params:
                 df.at[count, 'dose_gy'] = self.dose_to_gy(dose_key, dvh_updated_list[i]['parameters'][dose_key])
                 df.at[count, 'volume_perc'] = dvh_updated_list[i]['constraints'][goal_key]
                 df.at[count, 'dvh_type'] = 'goal'
-                df.at[count, 'dvh_method'] = dvh_method
-                df.at[count, 'weight'] = dvh_updated_list[i]['parameters']['weight']
+                df.at[count, 'dvh_method'] = dvh_updated_list[i]['parameters'].get('dvh_method', None)
+                df.at[count, 'weight'] = dvh_updated_list[i]['parameters'].get('weight', 5)  # default weight 5
                 df.at[count, 'bound_type'] = dvh_updated_list[i]['constraints'].get('bound_type', 'upper')
                 count = count + 1
             if 'dose_volume_D' in dvh_updated_list[i]['type']:
@@ -301,7 +299,7 @@ def get_dvh_table(self, my_plan: Plan, constraint_list: list = None, opt_params:
                 df.at[count, 'structure_name'] = dvh_updated_list[i]['parameters']['structure_name']
                 df.at[count, 'volume_perc'] = dvh_updated_list[i]['parameters']['volume_perc']
                 df.at[count, 'dose_gy'] = self.dose_to_gy(limit_key, dvh_updated_list[i]['constraints'][limit_key])
-                df.at[count, 'dvh_method'] = dvh_method
+                df.at[count, 'dvh_method'] = dvh_updated_list[i]['parameters'].get('dvh_method', None)
                 df.at[count, 'dvh_type'] = 'constraint'
                 df.at[count, 'bound_type'] = dvh_updated_list[i]['constraints'].get('bound_type', 'upper')
                 count = count + 1
@@ -310,9 +308,9 @@ def get_dvh_table(self, my_plan: Plan, constraint_list: list = None, opt_params:
                 df.at[count, 'structure_name'] = dvh_updated_list[i]['parameters']['structure_name']
                 df.at[count, 'volume_perc'] = dvh_updated_list[i]['parameters']['volume_perc']
                 df.at[count, 'dose_gy'] = self.dose_to_gy(goal_key, dvh_updated_list[i]['constraints'][goal_key])
-                df.at[count, 'dvh_method'] = dvh_method
+                df.at[count, 'dvh_method'] = dvh_updated_list[i]['parameters'].get('dvh_method', None)
                 df.at[count, 'dvh_type'] = 'goal'
-                df.at[count, 'weight'] = dvh_updated_list[i]['parameters']['weight']
+                df.at[count, 'weight'] = dvh_updated_list[i]['parameters'].get('weight', 5)  # default weight 5
                 df.at[count, 'bound_type'] = dvh_updated_list[i]['constraints'].get('bound_type', 'upper')
                 count = count + 1
         self.dvh_table = df
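The refactor above replaces a hoisted `dvh_method` variable with per-row `dict.get` lookups and gives `weight` a fallback of 5, so criteria that omit these keys no longer raise `KeyError`. The defaulting behavior can be sketched in isolation (the two toy criteria and the "cvar" method name are illustrative, not the real `dvh_updated_list`):

```python
import pandas as pd

# Two toy criteria: one specifies a weight and method, one omits both
dvh_updated_list = [
    {"parameters": {"structure_name": "HEART", "weight": 50, "dvh_method": "cvar"}},  # hypothetical method name
    {"parameters": {"structure_name": "LUNGS_NOT_GTV"}},
]

df = pd.DataFrame()
for count, item in enumerate(dvh_updated_list):
    params = item["parameters"]
    df.at[count, "structure_name"] = params["structure_name"]
    df.at[count, "dvh_method"] = params.get("dvh_method", None)  # None/NaN if absent
    df.at[count, "weight"] = params.get("weight", 5)             # default weight 5

print(df)
```

Using `.get` per row also avoids the subtle bug of reading a value hoisted outside the loop body that may not match the criterion currently being processed.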
