|
89 | 89 | "id": "655614ea-9dc9-4983-8468-0d9c3e97273b", |
90 | 90 | "metadata": {}, |
91 | 91 | "source": [ |
92 | | - "#### Loading Data with `DataExplorer`\n", |
93 | 92 | "\n", |
94 | | - "You can initialize `DataExplorer` in one of two ways:\n", |
95 | 93 | "\n", |
96 | | - "- 📥 Using a Hugging Face repo ID (`hf_repo_id`) — data is downloaded if not already present.\n", |
97 | | - "- 📁 Using a local directory (`data_dir`) — useful if you've already downloaded the dataset.\n", |
| 94 | + "#### Downloading and Loading the Data with `DataExplorer`\n", |
| 95 | + "\n", |
| 96 | + "You can browse patient metadata using our interactive [Hugging Face Space](https://huggingface.co/spaces/PortPy-Project/portpy_dataset_visualization). This web app lets you visually explore patients and their metadata.\n", |
| 97 | + "\n", |
| 98 | + "Once you identify the patient(s) of interest, you can download their data using either the CLI command or the Python function available in PortPy.\n", |
| 99 | + "👇 **Choose either of the options below.**\n", |
| 100 | + "\n", |
| 101 | + "Option 1: Use the CLI command `download_portpy_data` to download data for specific patient(s):\n", |
| 102 | + "```bash\n", |
| 103 | + "download_portpy_data --patients Lung_Patient_3 Lung_Patient_4 --beam-mode planner --out ./\n", |
| 104 | + "```\n", |
| 105 | + "\n", |
| 106 | + "- `--patients`: list of patient IDs\n", |
| 107 | + "- `--beam-mode`: \"all\", \"planner\", or a list of beam IDs (e.g., `0 10 20`). Default is \"planner\".\n", |
| 108 | + "- `--out`: output directory\n", |
98 | 109 | "\n", |
99 | | - " You can also browse and download patient data manually using our interactive [Hugging Face Space](https://huggingface.co/spaces/PortPy-Project/portpy_dataset_visualization):\n", |
100 | | - " This web app allows you to:\n", |
101 | | - " - Visually explore patients and their metadata\n", |
102 | | - " - Download selected patient datasets\n", |
103 | | - " \n", |
104 | | - " Once downloaded, you can load the data locally\n", |
105 | 110 | "\n", |
106 | | - "👇 **You can choose any of the below option.**" |
| 111 | + "Option 2: Use the PortPy function `download_portpy_data` to download data for specific patient(s):\n", |
| 112 | + "```python\n", |
| 113 | + "pp.download_portpy_data(\n", |
| 114 | + " patients=[\"Lung_Patient_3\", \"Lung_Patient_4\"], # list of patient ids\n", |
| 115 | + " out=\"./\", # output directory\n", |
| 116 | + " beam_mode=\"planner\", # options: \"all\" or \"planner\" or beam ids list e.g., [0,10,20]. Default is \"planner\"\n", |
| 117 | + ")\n", |
| 118 | + "```\n", |
| 119 | + "Your data should now be located at `./data/Lung_Patient_3`. We recommend storing patient data under a `data/` directory at the project root, but this is not required.\n", |
| 120 | + "```text\n", |
| 121 | + "<project-root>/\n", |
| 122 | + "├── data/\n", |
| 123 | + "│ ├── Lung_Patient_3/\n", |
| 124 | + "│ ├── Lung_Patient_4/\n", |
| 125 | + "│ └── ...\n", |
| 126 | + "├── examples/\n", |
| 127 | + "│ └── example_*.py\n", |
| 128 | + "├── portpy/\n", |
| 129 | + "└── README.md\n", |
| 130 | + "```\n", |
| 131 | + "\n", |
| 132 | + "#### Initializing `DataExplorer`\n", |
| 133 | + "Now you can initialize `DataExplorer` in one of two ways:\n", |
| 134 | + "\n", |
| 135 | + "- 📁 Using a local directory (`data_dir`) — useful if you've already downloaded the dataset.\n", |
| 136 | + "- 📥 Using a Hugging Face repo ID (`hf_repo_id`) — useful if you have not downloaded the data yet.\n", |
| 137 | + "\n" |
107 | 138 | ] |
108 | 139 | }, |
109 | 140 | { |
|
196 | 227 | } |
197 | 228 | ], |
198 | 229 | "source": [ |
199 | | - "# If you have already downloaded data, you can specify the patient data location and create data explorer.\n", |
200 | | - "# data_dir = r'../../data'\n", |
201 | | - "# data = pp.DataExplorer(data_dir=data_dir)\n", |
| 230 | + "# ------------------------------------------------------------\n", |
| 231 | + "# Option 1: Load data from a local directory (recommended if\n", |
| 232 | + "# you have already downloaded patient data)\n", |
| 233 | + "# ------------------------------------------------------------\n", |
| 234 | + "data_dir = r'../../data'\n", |
| 235 | + "data = pp.DataExplorer(data_dir=data_dir)\n", |
202 | 236 | "\n", |
203 | | - "# Use PortPy DataExplorer class to explore PortPy data\n", |
204 | | - "data = pp.DataExplorer(hf_repo_id=\"PortPy-Project/PortPy_Dataset\", local_download_dir='../hugging_face_data')\n", |
| 237 | + "# ------------------------------------------------------------\n", |
| 238 | + "# Option 2: Explore the full PortPy dataset directly from\n", |
| 239 | + "# Hugging Face (metadata access; data downloaded on demand)\n", |
| 240 | + "# ------------------------------------------------------------\n", |
| 241 | + "# Uncomment the following line if you have NOT downloaded data locally\n", |
| 242 | + "# data = pp.DataExplorer(\n", |
| 243 | + "# hf_repo_id=\"PortPy-Project/PortPy_Dataset\",\n", |
| 244 | + "# local_download_dir=\"../hugging_face_data\"\n", |
| 245 | + "# )\n", |
205 | 246 | "\n", |
206 | 247 | "# display the list of patients available in portpy dataset\n", |
207 | 248 | "df = data.display_list_of_patients(return_df=True)\n", |
|
359 | 400 | "# pick a patient from the existing patient list to get detailed info (e.g., beam angles, structures).\n", |
360 | 401 | "data.patient_id = 'Lung_Patient_3'\n", |
361 | 402 | "\n", |
362 | | - "# download patient data for only expert selected beams from hugging face. \n", |
363 | | - "# Users can download all beams by using use_planner_beams_only = False.\n", |
364 | | - "data.filter_and_download_hf_dataset()\n", |
| 403 | + "# You can also download the above patient's data on demand from Hugging Face. Users can download all beams by setting use_planner_beams_only = False.\n", |
| 404 | + "# data.filter_and_download_hf_dataset()\n", |
365 | 405 | "\n", |
366 | 406 | "# display the data of the patient \n", |
367 | 407 | "# users can get the results back as pandas DataFrames by using the arguments 'return_beams_df' and 'return_structs_df'\n", |
|
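The choice between the two `DataExplorer` options in this change (local `data_dir` versus `hf_repo_id`) depends on whether the patient folders are already on disk. A minimal sketch of that check follows, assuming the one-folder-per-patient layout shown in the directory tree. The helper names `find_local_patients` and `missing_patients` are hypothetical and not part of PortPy; the sketch uses only the standard library.

```python
from pathlib import Path


def find_local_patients(data_dir, patient_ids):
    """Return the patient ids that already have a folder under data_dir.

    Hypothetical helper: assumes each patient lives in <data_dir>/<patient_id>/.
    """
    root = Path(data_dir)
    return [pid for pid in patient_ids if (root / pid).is_dir()]


def missing_patients(data_dir, patient_ids):
    """Return the patient ids that have no folder under data_dir yet."""
    present = set(find_local_patients(data_dir, patient_ids))
    return [pid for pid in patient_ids if pid not in present]


if __name__ == "__main__":
    wanted = ["Lung_Patient_3", "Lung_Patient_4"]
    todo = missing_patients("../../data", wanted)
    if todo:
        # These patients would still need to be downloaded first
        # (CLI or Python function) or accessed through hf_repo_id.
        print("Not found locally:", todo)
    else:
        # All folders exist, so the local data_dir option applies.
        print("All requested patients are available locally.")
```

If the list of missing patients is empty, the local-directory option is the natural choice; otherwise the listed patients can be downloaded first or accessed through the Hugging Face option.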