Commit 435307e

Merge pull request #99 from ImagingDataCommons/claude-code-cleanup
cleanup/improvements by claude
2 parents 469e3be + 8918c53 commit 435307e

32 files changed

Lines changed: 20176 additions & 20116 deletions

.gitignore

Lines changed: 7 additions & 1 deletion
@@ -7,4 +7,10 @@ __pycache__/
 .idea/*
 
 
-gha-creds-**.json
+gha-creds-**.json
+
+# Downloaded DICOM files
+*.dcm
+
+# Test artifacts
+rms_test/
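
To see which paths the patterns in this hunk would cover, here is a minimal Python sketch. It is an approximation, not git's own matcher: it handles only slash-free patterns like the three above, the helper name `is_ignored` is made up for illustration, and it treats `**` without adjacent slashes as an ordinary wildcard. `git check-ignore -v <path>` remains the authoritative check.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns present in .gitignore after this commit (from the hunk above).
PATTERNS = ["gha-creds-**.json", "*.dcm", "rms_test/"]


def is_ignored(path: str, patterns=PATTERNS) -> bool:
    """Approximate gitignore matching for slash-free patterns only (hypothetical helper)."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything located below a directory with that name.
            name = pattern.rstrip("/")
            if any(fnmatchcase(part, name) for part in parts[:-1]):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # Slash-free pattern: may match a file or directory name at any depth.
            return True
    return False


if __name__ == "__main__":
    for candidate in ["notebooks/scan.dcm", "rms_test/output.txt", "README.md"]:
        print(candidate, is_ignored(candidate))
```

Matching each path component separately mirrors how git applies a pattern without a slash at any directory level; negation (`!`) and anchored patterns are deliberately left out.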

README.md

Lines changed: 32 additions & 9 deletions
@@ -4,7 +4,7 @@ This repository contains tutorial materials (for the most part, as Python notebo
 
 If this is the first time you hear about IDC, you may want to check out our [Getting Started documentation page](https://learn.canceridc.dev/getting-started-with-idc). Here are some highlights about what IDC has to offer:
 
-* **>93 TB of data**: IDC contains radiology, brightfield (H&E) and fluorescence slide microscopy images, along with image-derived data (annotations, segmentations, quantitative measurements) and accompanying clinical data
+* **~100 TB of data**: IDC contains radiology, brightfield (H&E) and fluorescence slide microscopy images, along with image-derived data (annotations, segmentations, quantitative measurements) and accompanying clinical data
 
 * **free**: all of the data in IDC is publicly available: no registration, no access requests
 
@@ -16,22 +16,40 @@ If this is the first time you hear about IDC, you may want to check out our [Get
 
 The tutorial notebooks are located in the [notebooks](https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks), and are organized in the following folders.
 
+## Recommended Learning Paths
+
+**Beginner (no GCP account needed):**
+1. [Basics of using IDC data programmatically](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/part2_searching_basics.ipynb) - start here!
+2. [Searching clinical data](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/exploring_clinical_data.ipynb)
+
+**Intermediate (GCP account needed):**
+3. [BigQuery prerequisites and setup](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/part1_prerequisites.ipynb) - set up Google Cloud for BigQuery access
+4. [Advanced searching using BigQuery](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/part3_exploring_cohorts.ipynb)
+5. [Working with IDC clinical data](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/advanced_topics/clinical_data_intro.ipynb)
+
+**Domain-specific paths:**
+* **Radiology**: [NLST exploration](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/collections_demos/nlst_exploration.ipynb)[TotalSegmentator features](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/collections_demos/TotalSegmentator_CT_Segmentations_features_extraction.ipynb)[MedSAM on IDC](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/analysis/MedSAM_with_IDC.ipynb)
+* **Digital Pathology**: [Getting started with digital pathology](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/pathomics/getting_started_with_digital_pathology.ipynb)[Slide microscopy metadata](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/pathomics/slide_microscopy_metadata_search.ipynb)[DICOM annotations](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/pathomics/microscopy_dicom_ann_intro.ipynb)
+* **Cloud Access**: [Direct GCS/AWS access](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/advanced_topics/gcs_aws_direct_access.ipynb)[DICOMweb access](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/advanced_topics/idc_dicomweb_access.ipynb)[IDC on AWS](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/advanced_topics/idc_on_aws/idc-on-aws-tutorial.ipynb)
+
 ## [`getting_started`](https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks/getting_started)
 
-"Getting Started" python notebooks are intended to introduce the users to IDC.
+"Getting Started" python notebooks are intended to introduce the users to IDC.
 
-* [Basics of using IDC data programmatically](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/part2_searching_basics.ipynb): learn how to use `idc-index` python package to programmatically search and download IDC data, visualize images and annotations, build cohorts and checking acknowledgments and liceses for the data included in your cohort.
+* [Basics of using IDC data programmatically](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/part2_searching_basics.ipynb): learn how to use `idc-index` python package to programmatically search and download IDC data, visualize images and annotations, build cohorts and checking acknowledgments and licenses for the data included in your cohort.
 * [Searching clinical data](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/exploring_clinical_data.ipynb): identify clinical and other non-imaging data accompanying imaging collections in IDC using `idc-index` python package and `duckdb`.
+* [BigQuery prerequisites and setup](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/part1_prerequisites.ipynb): set up the prerequisites for working with IDC data via Google BigQuery. This notebook is only needed if you plan to use BigQuery for advanced metadata queries (Part 3 and beyond).
 * [Advanced searching using BigQuery](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/getting_started/part3_exploring_cohorts.ipynb): access all of the metadata to build comprehensive queries and detailed cohort selection criteria.
 
 ## [`advanced_topics`](https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks/advanced_topics)
 
-Notebooks in this folder focus on topics that will require understanding of the basics, and aim to address more narrow use cases of IDC usage.
+Notebooks in this folder focus on topics that will require understanding of the basics, and aim to address more narrow use cases of IDC usage.
 
 * [Searching DICOM private tags](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/advanced_topics/dicom_private_tags_intro.ipynb): all of DICOM attributes for the imaging data in IDC are searchable using BigQuery. DICOM private tags often contain critical information, such as diffusion b-values, but are a bit more tricky to access from BigQuery. In this tutorial you will learn how to accomplish this.
 * [Working with IDC clinical data](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/advanced_topics/clinical_data_intro.ipynb): this tutorial will help you become familiar with how clinical and other tabular data accompanying images in IDC is organized, and how you can use it to build cohorts.
 * [Direct access to tags/frames from GCS/AWS buckets](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/advanced_topics/gcs_aws_direct_access.ipynb): learn how to access individual frames or tags of large DICOM files from the bucket without having to download the entire file (this notebook accompanies documentation article here: https://learn.canceridc.dev/data/downloading-data/direct-loading)
 * [Using DICOMweb to access IDC data](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/advanced_topics/idc_dicomweb_access.ipynb): both metadata and pixel data can be access using DICOMweb, which is particularly important while working with digital pathology, as it enables granular access to the individual pyramid tiles (frames)
+* [IDC on AWS](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/advanced_topics/idc_on_aws/idc-on-aws-tutorial.ipynb): learn how to work with IDC data using AWS services, including AWS HealthImaging.
 
 ## [`viewers_deployment`](https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks/viewers_deployment)
 
@@ -41,9 +59,9 @@ These notebooks can be used to deploy your own cloud-based instance of OHIF or S
 * [Slim Firebase deployment](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/viewers_deployment/slim_Firebase_deployment.ipynb)
 * [Setting up Google Healthcare DICOM store](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/viewers_deployment/Creating_Google_Healthcare_DICOM_store.ipynb): once you have your viewers deployed, you can use this tutorial to create a DICOM store with your data, which you can then access from the viewers deployed
 
-## [`collectons_demos`](https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks/collections_demos)
+## [`collections_demos`](https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks/collections_demos)
 
-This folders contains notebooks that demonstrate the usage of the data in the specific IDC collections. The notebooks in this folder will always have the prefix of the `collection_id` they correspond to, for easier navigation.
+This folder contains notebooks that demonstrate the usage of the data in the specific IDC collections. The notebooks in this folder will always have the prefix of the `collection_id` they correspond to, for easier navigation.
 
 * [Using hiplot for exploring prostate MRI in IDC](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/collections_demos/prostate-MRI_hiplot_experiments.ipynb): this notebook demonstrates how [`hiplot`](https://facebookresearch.github.io/hiplot/), an open source package for high-dimensional parameter visualization, for examining various MRI acquisition parameters for the prostate MRI images available in IDC.
 * [Visible Human Project exploration](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/collections_demos/nlm_visible_human_project.ipynb): demonstration of searching and visualizing images from the National Library of Medicine Visible Human Project available on IDC.
@@ -56,25 +74,30 @@ This folders contains notebooks that demonstrate the usage of the data in the sp
 This folder is dedicated to the notebooks focused on the digital pathology (pathomics) applications. The use of DICOM standard is relatively new in digital pathology, and this field is being actively developed, thus a dedicated folder for this.
 * [Getting started with pathology images in IDC](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/pathomics/getting_started_with_digital_pathology.ipynb): all of the pathology images in IDC are in DICOM Slide Microscopy format; this notebook will help you get started with using this representation and also searching IDC pathology images.
 * [Exploring IDC slide microscopy images metadata](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/pathomics/slide_microscopy_metadata_search.ipynb): introduction to the key metadata accompanying IDC slide microscopy images that can be used for subsetting data and building cohorts.
-* [Working with DICOM Microscopy Bulk Simple annotations](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/pathomics/Tutorial_MicroscopyBulkSimpleAnnotations.ipynb): introduction to working with bulk annotations encoded in DICOM using the example of nuclei annotations.
+* [Working with DICOM Microscopy Bulk Simple annotations](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/pathomics/microscopy_dicom_ann_intro.ipynb): introduction to working with bulk annotations encoded in DICOM using the example of nuclei annotations.
 
 ## [`analysis`](https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks/analysis)
 
 Demonstrations/examples of analyses of images from IDC.
 * [MedSAM on IDC](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/analysis/MedSAM_with_IDC.ipynb): learn how to experiment with MedSAM on the images available from IDC.
 * [MHub.ai with IDC data](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/analysis/mhubai_tutorial.ipynb): [MHub.ai](https://mhub.ai) is a platform for Deep Learning models in medical imaging, which are interoperable with IDC and can be applied directly to the IDC DICOM images. Learn how to get started from this notebook!
 
+## [`idc_api`](https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks/idc_api)
+
+Notebooks demonstrating the IDC REST API.
+* [How to use the IDC V2 API](https://github.com/ImagingDataCommons/IDC-Tutorials/blob/master/notebooks/idc_api/How_to_use_the_IDC_V2_API.ipynb): introduction to the IDC REST API v2 for programmatic access to IDC data, including cohort creation, manifest generation, and data download.
+
 ## [`labs`](https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks/labs)
 
 Here you will find an archive of the notebooks that were used in tutorials, which at times may demonstrate experimental features. By design, the notebooks presented at specific events may not be updated after the event, and are stored in this folder for archival purposes.
 
 ## [`deprecated`](https://github.com/ImagingDataCommons/IDC-Tutorials/tree/master/notebooks/deprecated)
 
-IDC is an actively evolving resource. As we develop new and improved capabilities, we improve our recommended usage practices, and may deprecate notebooks that are no longer maintained and may no longer work. You will find thse in the `deprecated` folder.
+IDC is an actively evolving resource. As we develop new and improved capabilities, we improve our recommended usage practices, and may deprecate notebooks that are no longer maintained and may no longer work. You will find these in the `deprecated` folder.
 
 ## `testing`
 
-This directory is used for the maintenance of the repository to support testing of the actively supported notebooks.
+This directory is used for the maintenance of the repository to support testing of the actively supported notebooks.
 
 # Support
 