The repository contains usage instructions for CTNorm: a toolkit that offer modules for data understanding, image harmonization, and model performance evaluation across different datasets.
These instructions will guide you through setting up and using CTNorm.
Currently, CTNorm supports only DICOM (.dcm) format.
- A CSV file is required that must contain a column named
uids, where each row corresponds to the path of a DICOM folder.
| uids |
|---|
/data/path/to/dicom/folder1 |
/data/path/to/dicom/folder2 |
/data/path/to/dicom/folder3 |
To ensure easy setup and reproducibility, we recommend running CTNorm using Docker.
1οΈβ£ Clone the CTNorm repository:
git clone https://github.com/hsu-lab/ctnorm.git2οΈβ£ Pull the Docker image from Docker Hub π³:
docker pull litou/mii-pytorch:20.113οΈβ£ Run Docker in interactive mode
docker run --name=<container_name> --shm-size=<memory_size> --gpus device=<gpu_id> -it --rm -p <port_number>:<port_number> -v /etc/localtime:/etc/localtime:ro -v "$(pwd)":/workspace -v <path_to_input_data>:/data litou/mii-pytorch:20.11Parameters:
- <container_name>: Specify the name of the container.
- <memory_size>: Specify the shared memory size (e.g.,
2g,4g,6g). - <gpu_id>: Specify the gpu device id. If no GPU is available, this parameter can be omitted.
- <port_number>: Specify a port number in Docker. This is required when running the Flask component of the toolkit.
- <path_to_input_data>: Path to the input data directory, which will be mounted as
/datain the container.
π¨ Note: The paths specified in the CSV file must be accessible within the mounted Docker container.
4οΈβ£ Install the CTNorm package locally:
cd /workspace/ctnorm # Move to the project directory
pip install -e .The CTNorm pipeline requires a YAML configuration file to define parameters needed to run each module. Below is a breakdown of each section in config.yaml.
Global:
session_base_path: "./SESSIONS" # Base path where all session folders will be created; each run creates a new sessionDatasets:
NLST:
in_uids: "/path_to_nlst_data_cases.csv" # Path to a CSV file containing UIDs (references to DICOM folders)
in_dtype: ".dcm" # Specifies input data type. Must be in DICOM format at the moment
description: "National Lung Screening Trial dataset" # Descriptive name for the dataset
SPIE:
in_uids: "/path_to_spie_data_cases.csv"
in_dtype: ".dcm"
description: "SPIE LungX dataset"
- Each dataset must have
in_uids,in_dtype, anddescriptionfields. - The dataset name (e.g.,
NLST,SPIE) is a user-defined key and should be unique.
Modules:
Characterization: true
Harmonization: true
Robustness: true- Set
trueorfalseto enable/disable a module.
Characterization:
input_datasets:
- name: NLST # β
Valid - Must match a dataset in the `Datasets` section
- name: SPIE # β
Valid - Must match a dataset in the `Datasets` section
- name: XYZ # β Invalid - Doesn't match any dataset in the `Datasets` section
metrics:
voxel:
- all
metadata:
- all
params:
clip_range : [-1024, 3071]
bins: 64
kde_points: 1e3
kde_sample: 1e5- The dataset name specified in
input_datasetsmust match one of key defined in theDatasetssection. - If multiple datasets are provided, each must be listed separately under input_datasets.
- The
metricsfield defines the types of properties that will be characterized in the dataset. It includes voxel statistics and metadata properties.
| Available Options | Category | Description |
|---|---|---|
| histogram | voxel | Generates an intensity histogram to analyze intensity distribution. |
| kde | voxel | Computes Kernel Density Estimation (KDE) for voxel intensities. |
| snr | voxel | Measures the ratio of signal intensity to noise, indicating image quality. |
| skewness | voxel | Calculates the asymmetry of the voxel intensity distribution. |
| kurtosis | voxel | Measures the "tailedness" of the intensity distribution. |
| all | voxel | Includes all the above voxel metrics. |
| slice_thickness | metadata | Analyzes the distribution of slice thickness across the dataset. |
| convolution_kernel | metadata | Summarizes the convolution kernels used during image reconstruction. |
| manufacturer | metadata | Lists the equipment manufacturers to assess scanner variability. |
| all | metadata | Includes all the above metadata metrics. |
-
π¨ Note: We plan to expand the available
metricsto includeradiomicfeature extraction in an upcoming update. -
For voxel-level analysis, the following
paramscan be set:clip_range: Sets the intensity value range for voxel-level analysis.[-1024, 3071]β Analyzes voxel intensities only within this specified range.
bins: Defines the number of bins for histogram computation.64β The histogram will use 64 bins for distribution analysis.
kde_points: Specifies the number of points for Kernel Density Estimation (KDE) curve generation.1e3β Uses 1,000 points to compute the KDE curve.
kde_sample: Sets the sample size for KDE estimation.1e5β Uses 100,000 voxel samples for KDE calculations.
The Harmonization module can run in two different modes.
1οΈβ£ Running Harmonization in Test Mode β Uses a pretrained model to harmonize input datasets.
Harmonization:
mode: "test" # Runs harmonization in inference mode
input_datasets:
- name: NLST # β
Valid - Must match a dataset in the `Datasets` section
in_uids: "/path_to_nlst_subset_data.csv" # πΉ (Optional) Overrides the `in_uids` from the `Datasets` section if specified
models:
- name: SNGAN # The model being used
pretrained_G: "./pretrained_weights/SNGAN/latestG-1-1.pth" # Path to the pretrained generator model
param:
tile_xy: 512 # Tile size along X & Y (Keep 512 during inference)
tile_z: 32 # Tile size along Z
z_overlap: 4 # Overlap between slices
gpu_id: 0 # GPU device ID
out_dtype: ".dcm" # Output data format (must be either .nii.gz or .dcm)
save_lr: true
metrics:
- snr
- sobel
- radiomic - The Harmonization module supports multiple models for CT harmonization. Below are the available model options:
| Model Name | Description |
|---|---|
| SNGAN | Spectral Normalization GAN, used for image-to-image translation. |
| WGAN | Wasserstein GAN, improves stability of training for generative models. |
| Pix2Pix | Conditional GAN, useful for paired image transformation. |
| SRResNet | Super-Resolution ResNet, designed for image enhancement. |
| RRDB | Residual-in-Residual Dense Block, used in SRGAN-style super-resolution tasks. |
| BM3D | A non-deep learning method for denoising images. |
- We have provided the pretrained weights here. Update the
pretrained_Gparameter depending on the model accordingly.
For BM3D, only one optional parameter can be specified; other parameters are not needed:
models:
- name: BM3D
param:
noise_type: "psd" # Optional, choose between "psd" or "std"- To evaluate the effectivness of harmonization, the following
metricscan be computed:
| Metric | Description |
|---|---|
snr (Signal-to-Noise Ratio) |
Measures the clarity of the image signal relative to background noise. A higher SNR indicates better image quality with less noise. |
sobel (Sobel Edge Detection) |
Applies a Sobel filter to evaluate edge sharpness, ensuring that anatomical boundaries are preserved. |
radiomic (Radiomic Feature Analysis) |
Extracts radiomic features to assess intensity, and texture characteristics for image-derived features. |
- π¨ Note: We plan to expand the available
metricsto includetsneanalysis in an upcoming update.
2οΈβ£ Want to train your own harmonization model from scratch ? Follow the steps outlined here.
We currently have Sybil model integrated as part of the robustness analysis module. It is a deep learning model developed to analyze chest CT scans and predict an individual's risk of developing lung cancer over multiple time horizons, including 1-year, 2-year, and 6-year periods. Read more about it here.
Robustness:
input_datasets:
- name: NLST # β
Valid - Must match a dataset in the `Datasets` section
in_uids: "/path_to_nlst_subset_data.csv" # πΉ (Optional) Overrides the `in_uids` from the `Datasets` section if specified
variability:
- manufacturer
- slice_thickness
- convolution_kernel
param:
model_type: "sybil_ensemble" # Options: sybil_1, sybil_2, sybil_3, sybil_4, sybil_5, sybil_ensemble
evaluate: true # Requires 'label' and 'time_to_event' columns in `in_uids` CSV. If set to false, it will save the predicted scores.variabilitydefines the imaging variation to be assessed, as identified in Characterization module.- If not specified, it will run Sybil on all cases specified in
in_uids.
π¨ Note: Defined variability must exist in the generatedmetadata_characterization.csvfile. - If the Robustness module is run at a different time (not together with the Characterization module), a
load_fromparameter must be specified to load previously generated characterization data csv as shown below:
Robustness:
input_datasets:
- name: NLST
in_uids: "/path_to_nlst_subset_data.csv"
variability:
- manufacturer
- slice_thickness
- convolution_kernel
load_from: "20250203-033504-40" # # πΉ (Optional) Specify the session number from which to load `metadata_characterization.csv`
param:
model_type: "sybil_ensemble"
evaluate: truectnorm --config config.yamlCTNorm also offers a user-friendly interface to visualize the outputs of each session.
π¨ Note: The visualization component is continuously evolving, with new metric outputs and their visualizations being supported over time!
- To launch the app:
ctnorm-webapp --port <port_number> --session-out <path_to_session_folder>Parameters:
- <port_number>: Specify the port exposed in the Docker container. This must match the port defined when running Docker (Step 2: 3οΈβ£ Running Docker in interactive mode).
- <path_to_session_folder>: Provide the path to the folder where all session outputs are stored. This should be the same session directory used when running the CTNorm Toolkit.
Accessing the Web App on a Local Machine at:
http://localhost:<port_number>Accessing the Web App on a Remote Machine at:
http://<remote_server_ip>:<port_number>The Flask application displays the status of all sessions. Sessions that are complete can be started.
π¨ Popup Blocker Warning for Harmonization Viewer:
The image viewer from the Harmonization tab opens in a popup window. Some browsers may block popups by default. If you see a popup blocked notification, allow it to ensure the viewer opens correctly.
π§ Work in Progress π§
More features coming soon!
