LianZifeng/BrainSeg

🧠 BrainSeg: A Generalized Framework for Comprehensive Multimodal Brain Tissue Segmentation, Parcellation, and Lesion Labeling

Official implementation code for BrainSeg. We propose a novel AI-based tool for comprehensive brain imaging segmentation with generalizability across multiple modalities, including MRI, CT, PET, and ultrasound, as well as across the lifespan (from fetuses to the elderly). This framework consists of three main components: B-Syn, B-CLIP, and BrainSeg.


Model overview

[Figure: BrainSeg Framework]

Results

[Figure: Results]

πŸ› οΈ Installation

To ensure a clean workspace and prevent dependency conflicts, we strongly recommend creating a new Conda environment before running the code.

1. Create and Activate Environment

# Create a new conda environment named 'BrainSeg' with Python 3.9 and install the required libraries
cd /your/path/to/this/repository
conda env create -f environment.yml -n BrainSeg

# Activate the environment
conda activate BrainSeg
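
After activating the environment, a quick import check can confirm that the installation worked. The sketch below is only an illustration: the package names in ASSUMED_PACKAGES are guesses, not taken from environment.yml, so replace them with the dependencies actually listed there.

```python
import importlib.util
import sys

# Hypothetical package list (an assumption); use the names from environment.yml.
ASSUMED_PACKAGES = ["torch", "nibabel", "open_clip"]


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_packages(ASSUMED_PACKAGES)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

If any package is reported missing, re-run the environment creation step before continuing.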

Get started with B-Syn

We provide a demo script for immediate testing and usage of our B-Syn module.

🧪 Quick Start

First, navigate to the B-Syn source directory:

cd ./BSyn/BSyn/

1. Multimodal Synthesis

You can synthesize images for different modalities by specifying the target output filename. Please refer to our function arguments in the code for a full list of supported modality parameters. For example, to generate a T2-weighted MRI image:

python BSyn_Demo.py --modality T2-brain.nii.gz

2. Lesion Synthesis

B-Syn supports the simulation of pathological features, such as tumors and strokes. You can control the pathology type using the --lesion_type argument.

Generate Tumor data

python BSyn_Demo.py --modality Flair-brain.nii.gz --lesion_type tumor

Generate Stroke data

python BSyn_Demo.py --modality DWI-brain.nii.gz --lesion_type stroke
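
To run several syntheses in one go, the calls above can be generated from Python. This is a minimal sketch that uses only the --modality and --lesion_type arguments shown in this README; the helper name build_bsyn_command and the JOBS list are illustrative and not part of the repository.

```python
import shlex
import sys


def build_bsyn_command(modality, lesion_type=None, script="BSyn_Demo.py"):
    """Build the argument list for one BSyn_Demo.py call."""
    cmd = [sys.executable, script, "--modality", modality]
    if lesion_type is not None:
        cmd += ["--lesion_type", lesion_type]
    return cmd


# The three example calls from this README: (modality filename, lesion type).
JOBS = [
    ("T2-brain.nii.gz", None),
    ("Flair-brain.nii.gz", "tumor"),
    ("DWI-brain.nii.gz", "stroke"),
]

if __name__ == "__main__":
    for modality, lesion in JOBS:
        # Print only; to execute, pass the list to subprocess.run(cmd, check=True)
        # from inside ./BSyn/BSyn/.
        print(shlex.join(build_bsyn_command(modality, lesion)))
```

Building the command as a list avoids shell quoting problems if a filename contains spaces.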

Get started with B-CLIP

βš™οΈ Step 1: Set up the environment for BiomedCLIP

Our B-CLIP fine-tunes BiomedCLIP text encoder based on LoRA, so you need to first configure the Biomedical environment:

1. First clone the latest BiomedCLIP model (the commit version we used is 27005c2, and earlier versions may have compatibility issues)

mkdir BiomedCLIP
cd ./BiomedCLIP
git clone https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224

2. Then clone the latest BiomedBERT-abstract model:

mkdir BiomedBERT-abstract
cd ./BiomedBERT-abstract
git clone https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract

3. If you encounter network issues when running git clone, you can instead download the prepared folders from the following links: BiomedCLIP and BiomedBERT-abstract.

4. To load BiomedCLIP from your own local path, you need to make a small modification to the open_clip source code. Please follow: mlfoundations/open_clip#772 (comment)

5. Finally, modify the model configuration so that the text encoder outputs tokens. In /your/path/to/BiomedCLIP/open_clip_config.json, add the output_tokens setting to the "text_cfg" dictionary:

Before:

   ...
   "context_length": 256
}

After:

    ...
    "context_length": 256,
    "output_tokens": true
}
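
The same change can be applied with a short script instead of editing the file by hand. This is a sketch under one assumption: it does not rely on where "text_cfg" sits in the JSON, and simply sets output_tokens in every "text_cfg" dictionary it finds. The function names are illustrative.

```python
import json
from pathlib import Path


def enable_output_tokens(config):
    """Set output_tokens=True in every "text_cfg" dict inside `config`; return how many were found."""
    changed = 0
    if isinstance(config, dict):
        for key, value in config.items():
            if key == "text_cfg" and isinstance(value, dict):
                value["output_tokens"] = True
                changed += 1
            else:
                changed += enable_output_tokens(value)
    elif isinstance(config, list):
        for item in config:
            changed += enable_output_tokens(item)
    return changed


def patch_config_file(path):
    """Rewrite the JSON file at `path` with output_tokens enabled."""
    path = Path(path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if enable_output_tokens(config) == 0:
        raise KeyError('no "text_cfg" dictionary found in ' + str(path))
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
```

For example, patch_config_file("/your/path/to/BiomedCLIP/open_clip_config.json") updates the file in place; keep a backup copy if you want to compare the result.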

6. Now you can load the BiomedCLIP model like this:

import open_clip

# Load the model weights and preprocessing transform from the local BiomedCLIP folder
model, preprocess = open_clip.create_model_from_pretrained(
    'hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224',
    pretrained='/your/path/to/BiomedCLIP/open_clip_pytorch_model.bin',
    cache_dir='/your/path/to/BiomedCLIP')
# Load the matching tokenizer from the same local folder
tokenizer = open_clip.get_tokenizer(
    model_name='hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224',
    cache_dir='/your/path/to/BiomedCLIP')

📂 Step 2: Prepare your data to train B-CLIP

To train B-CLIP on your own data, organize your files as follows:

prompt.xlsx                     # Excel file with text metadata
parameter.xlsx                  # Excel file with image parameters, used for data synthesis in the B-Syn module
lesion_parameter.xlsx           # Excel file with lesion parameters, used for lesion synthesis in the B-Syn module
data/
├── subject001/
│   ├── brain.nii.gz            # brain image
│   ├── tissue.nii.gz           # tissue map GT
│   ├── dk-struct.nii.gz        # ROI map GT
│   ├── T2-brain.nii.gz         # T2 modality image, if available
│   ├── CT-brain.nii.gz         # CT modality image, if available
│   ├── ……                      # other modality images, if available
├── subject002
├── subject003
└── ……
lesion/
├── subject01/
│   ├── brain.nii.gz            # brain image
│   ├── lesion.nii.gz           # lesion map GT
│   ├── Flair-brain.nii.gz      # T2-FLAIR modality image, if available
│   ├── T2-brain.nii.gz         # T2 modality image, if available
│   ├── ……                      # other modality images, if available
├── subject02
├── subject03
└── ……
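
Before training, it can help to check that every subject folder matches this layout. The sketch below assumes that brain.nii.gz, tissue.nii.gz, and dk-struct.nii.gz are required under data/ and that additional modalities follow the "<name>-brain.nii.gz" pattern seen above; check_subjects is an illustrative helper, not part of the repository.

```python
from pathlib import Path

# Files assumed to be required in each data/ subject folder (based on the tree above).
REQUIRED = ("brain.nii.gz", "tissue.nii.gz", "dk-struct.nii.gz")


def check_subjects(root, required=REQUIRED):
    """Map each subject folder name to (missing required files, extra modality files)."""
    report = {}
    for subject in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        present = {f.name for f in subject.iterdir() if f.is_file()}
        missing = [name for name in required if name not in present]
        extras = sorted(name for name in present if name.endswith("-brain.nii.gz"))
        report[subject.name] = (missing, extras)
    return report


if __name__ == "__main__":
    if Path("data").is_dir():
        for name, (missing, extras) in check_subjects("data").items():
            print(name, "missing:", missing, "modalities:", extras)
```

For the lesion/ folder, the same function can be called with required=("brain.nii.gz", "lesion.nii.gz").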

🚀 Step 3: Train B-CLIP

Now you can start training B-CLIP. You can either train from scratch or load our pretrained B-CLIP model for fine-tuning. The pretrained B-CLIP model can be downloaded through the following link: BCLIP

python ./BCLIP/train.py  # Please change the paths in the code to the paths of your own data

Get started with BrainSeg

📂 Step 1: Data preprocessing

Before training, preprocess your data with the same steps we used: bias field correction, skull stripping, registration of all images to the space of our training data, reorientation to a consistent RPI coordinate system, and cropping to (224, 256, 224). We provide a preprocessing script to facilitate these steps. After preprocessing, your data directory should be structured to match the B-CLIP training format.
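
As an illustration of the final cropping step only, the sketch below computes per-axis index ranges for a centered crop (or centered zero-padding when an axis is smaller than the target). Centering is an assumption made for this example; the provided preprocessing script may place the crop differently, and center_crop_pad_bounds is a hypothetical helper.

```python
def center_crop_pad_bounds(shape, target=(224, 256, 224)):
    """Return one (src_start, src_stop, dst_start, dst_stop) tuple per axis.

    Copying source[src_start:src_stop] into output[dst_start:dst_stop] along every
    axis of a zero-filled volume of size `target` gives a centered crop or pad.
    """
    bounds = []
    for size, want in zip(shape, target):
        if size >= want:
            # Axis is large enough: crop symmetrically around the center.
            src_start = (size - want) // 2
            bounds.append((src_start, src_start + want, 0, want))
        else:
            # Axis is too small: keep everything and pad symmetrically.
            dst_start = (want - size) // 2
            bounds.append((0, size, dst_start, dst_start + size))
    return bounds


if __name__ == "__main__":
    for axis, b in enumerate(center_crop_pad_bounds((256, 256, 200))):
        print("axis", axis, "src", b[:2], "dst", b[2:])
```

With an array library, each tuple translates directly into a pair of slices for the source and output volumes.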

🚀 Step 2: Train BrainSeg

Now you can start training BrainSeg. You can either train from scratch or load our pretrained BrainSeg models for fine-tuning. The pretrained models can be downloaded through the following links: BrainSeg_tissue for tissue segmentation, BrainSeg_parc for brain parcellation, and BrainSeg_lesion for lesion labeling.

python train.py  # Please change the path in the code to the path of your own data

Step 3: Inference using our pretrained model

We provide a set of example samples covering diverse age groups, multiple modalities, and lesion cases in Sample and a default text metadata prompt for the sample data in test.xlsx. You can run inference directly on these samples using our pre-trained model. You can also test on your own data, provided it is structured as follows:

test.xlsx                   # text metadata prompt
Sample/
├── sub001/
│   ├── brain.nii.gz
│   ├── tissue.nii.gz
│   ├── dk-struct.nii.gz
│   ├── T2-brain.nii.gz
│   ├── ……                  # any other modalities, if available
├── sub002/
│   ├── brain.nii.gz
│   ├── tissue.nii.gz
│   ├── dk-struct.nii.gz
│   ├── CT-brain.nii.gz
│   ├── ……                  # any other modalities, if available
├── sub003
└── ……

1. Tissue segmentation

You can run tissue segmentation inference using the following command:

python inference.py \
    --texts_path ./test.xlsx \
    --images_path ./Sample \
    --predir ./Sample \
    --model_dir /your/path/for/BrainSeg_model \
    --clip_dir /your/path/for/BCLIP \
    --img_size 224 256 224 \
    --in_channels 6 \
    --out_channels 4 \
    --mode tissue \
    --flag multi

    • --texts_path: Path to the Excel file (.xlsx) containing text prompts (Default: ./test.xlsx)
    • --images_path: Directory path of the input images to be processed (Default: ./Sample)
    • --img_size: Input image size as a tuple (D, H, W) (Default: 224 256 224)
    • --in_channels: Number of input channels (Default: 6)
    • --out_channels: Number of output classes (Default: 4)
    • --device: Device to run the model on, e.g., cuda for GPU or cpu (Default: cuda)
    • --model_dir: Path to the directory containing your pretrained BrainSeg model
    • --clip_dir: Path to the directory containing your pretrained B-CLIP model
    • --predir: Directory path where the output results will be saved (Default: ./Sample)
    • --mode: Specifies the prediction task (e.g., tissue, dk or lesion).
    • --flag: Determines the input modality strategy (Default: multi)
      • multi: Uses all modalities in the subject folder for fusion inference.
      • single: Uses a single modality for inference. Must be used with --modality.
    • --modality: Specifies the target modality filename when --flag is set to single (Default: CT-brain.nii.gz)

2. Brain Parcellation

You can run brain parcellation inference using the following command:

python inference.py \
    --texts_path ./test.xlsx \
    --images_path ./Sample \
    --predir ./Sample \
    --model_dir /your/path/for/BrainSeg_model \
    --clip_dir /your/path/for/BCLIP \
    --img_size 224 256 224 \
    --in_channels 7 \
    --out_channels 107 \
    --mode dk

3. Lesion labeling

You can run lesion labeling inference using the following command:

python inference.py \
    --texts_path ./test.xlsx \
    --images_path ./Sample \
    --predir ./Sample \
    --model_dir /your/path/for/BrainSeg_model \
    --clip_dir /your/path/for/BCLIP \
    --img_size 224 256 224 \
    --in_channels 6 \
    --out_channels 4 \
    --mode lesion
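
The three commands differ only in --mode and the channel counts, so they can be generated from one helper. The sketch below copies the channel values from the commands above and enforces the documented rule that --flag single needs --modality; build_inference_command and CHANNELS are illustrative names, not part of the repository.

```python
import shlex
import sys

# (in_channels, out_channels) per mode, copied from the commands in this README.
CHANNELS = {"tissue": (6, 4), "dk": (7, 107), "lesion": (6, 4)}


def build_inference_command(mode, model_dir, clip_dir, texts_path="./test.xlsx",
                            images_path="./Sample", predir="./Sample",
                            flag=None, modality=None):
    """Build the argument list for one inference.py call."""
    if mode not in CHANNELS:
        raise ValueError("mode must be one of: " + ", ".join(sorted(CHANNELS)))
    if flag == "single" and modality is None:
        raise ValueError("--flag single must be used with --modality")
    in_ch, out_ch = CHANNELS[mode]
    cmd = [sys.executable, "inference.py",
           "--texts_path", texts_path,
           "--images_path", images_path,
           "--predir", predir,
           "--model_dir", model_dir,
           "--clip_dir", clip_dir,
           "--img_size", "224", "256", "224",
           "--in_channels", str(in_ch),
           "--out_channels", str(out_ch),
           "--mode", mode]
    if flag is not None:
        cmd += ["--flag", flag]
    if modality is not None:
        cmd += ["--modality", modality]
    return cmd


if __name__ == "__main__":
    # Print only; pass the list to subprocess.run(cmd, check=True) to execute it.
    for mode in CHANNELS:
        print(shlex.join(build_inference_command(
            mode, "/your/path/for/BrainSeg_model", "/your/path/for/BCLIP")))
```

Remember that each task uses its own pretrained BrainSeg model, so model_dir should point to the checkpoint that matches the chosen mode.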

📖 Citation

If you find this work useful in your research, please cite:

Shijie Huang†, Zifeng Lian†, Dengqiang Jia†, Kaicong Sun†, Xiaoye Li†, Jiameng Liu†, Yulin Wang, Caiwen Jiang, Fangmei Zhu, Zhongxiang Ding*, Han Zhang*, Geng Chen*, Feng Shi*, Dinggang Shen*. BrainSeg: A Generalized Framework for Comprehensive Multimodal Brain Tissue Segmentation, Parcellation, and Lesion Labeling. (Under Review)

Copyright IDEA Lab, School of Biomedical Engineering, ShanghaiTech University, Shanghai, China.

Licensed under the GPL (General Public License);
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Repo for BrainSeg: A Generalized Framework for Comprehensive Multimodal Brain Tissue Segmentation, Parcellation, and Lesion Labeling
Contact: huangshj@shanghaitech.edu.cn
         lianzf2024@shanghaitech.edu.cn
