🧠 BrainSeg: A Generalized Framework for Comprehensive Multimodal Brain Tissue Segmentation, Parcellation, and Lesion Labeling

Official implementation of BrainSeg, an AI-based tool for comprehensive brain image segmentation that generalizes across multiple modalities (MRI, CT, PET, and ultrasound) and across the lifespan (from fetuses to the elderly). The framework consists of three main components: B-Syn, B-CLIP, and BrainSeg.

Model overview

Results

🛠️ Installation

To ensure a clean workspace and prevent dependency conflicts, we strongly recommend creating a new Conda environment before running the code.

1. Create and Activate Environment

# Create a new conda environment named 'BrainSeg' with Python 3.9 and install the required libraries
cd /your/path/to/this/repository
conda env create -f environment.yml -n BrainSeg

# Activate the environment
conda activate BrainSeg

Get started with B-Syn

We provide a demo script for immediate testing and usage of our B-Syn module.

🧪 Quick Start

First, navigate to the B-Syn source directory:

cd ./BSyn/BSyn/

1. Multimodal Synthesis

You can synthesize images for different modalities by passing the target output filename to --modality. See the function arguments in the code for the full list of supported modality parameters. For example, to generate a T2-weighted MRI image:

python BSyn_Demo.py --modality T2-brain.nii.gz

2. Lesion Synthesis

B-Syn supports the simulation of pathological features, such as tumors and strokes. You can control the pathology type using the --lesion_type argument.

Generate Tumor data

python BSyn_Demo.py --modality Flair-brain.nii.gz --lesion_type tumor

Generate Stroke data

python BSyn_Demo.py --modality DWI-brain.nii.gz --lesion_type stroke
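
To generate several modalities or lesion types in one go, the commands above can be assembled in a small Python loop. The following is a minimal sketch and not part of the repository: `build_bsyn_command` and `run_all` are hypothetical helpers, and they only use the two flags shown above (`--modality`, `--lesion_type`) with the output filenames that appear in this README.

```python
import subprocess
import sys
from typing import List, Optional, Sequence, Tuple


def build_bsyn_command(modality: str, lesion_type: Optional[str] = None) -> List[str]:
    """Build the argument list for one BSyn_Demo.py call (hypothetical helper)."""
    cmd = [sys.executable, "BSyn_Demo.py", "--modality", modality]
    if lesion_type is not None:
        cmd += ["--lesion_type", lesion_type]
    return cmd


# (modality filename, lesion type) pairs taken from the examples above.
JOBS: Sequence[Tuple[str, Optional[str]]] = (
    ("T2-brain.nii.gz", None),
    ("Flair-brain.nii.gz", "tumor"),
    ("DWI-brain.nii.gz", "stroke"),
)


def run_all(jobs: Sequence[Tuple[str, Optional[str]]] = JOBS) -> None:
    """Run every job in sequence and stop at the first failing command."""
    for modality, lesion_type in jobs:
        subprocess.run(build_bsyn_command(modality, lesion_type), check=True)
```

Call `run_all()` from the ./BSyn/BSyn/ directory so that BSyn_Demo.py is found.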

Get started with B-CLIP

⚙️ Step 1: Set up the environment for BiomedCLIP

B-CLIP fine-tunes the BiomedCLIP text encoder with LoRA, so you first need to set up the BiomedCLIP environment:

1. First clone the latest BiomedCLIP model (the commit version we used is 27005c2, and earlier versions may have compatibility issues)

mkdir BiomedCLIP
cd ./BiomedCLIP
git clone https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224

2. And then clone the latest BiomedBERT-abstract

mkdir BiomedBERT-abstract
cd ./BiomedBERT-abstract
git clone https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract

3. If you encounter network issues when running git clone, we also provide the already downloaded folders for convenience through the following links: BiomedCLIP and BiomedBERT-abstract.

4. To load BiomedCLIP from your own local path, you need to make a small modification to the source code of open-clip. Please follow: mlfoundations/open_clip#772 (comment)

5. Finally, modify the model configuration so that the text encoder outputs tokens. In /your/path/to/BiomedCLIP/open_clip_config.json, add the output_tokens setting to the "text_cfg" dictionary:

Before:

   ...
   "context_length": 256
}

After:

    ...
    "context_length": 256,
    "output_tokens": true
}
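
The same change can be applied programmatically. This is a minimal sketch under two assumptions: the helper name `enable_output_tokens` is hypothetical, and "text_cfg" is expected either at the top level of the JSON file or under a "model_cfg" key. Check your copy of open_clip_config.json before relying on it.

```python
import json
from pathlib import Path


def enable_output_tokens(config_path: str) -> bool:
    """Set "output_tokens": true inside the "text_cfg" dictionary.

    Returns True if the file was changed, False if the flag was already set.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Assumption: "text_cfg" sits at the top level or under "model_cfg".
    parent = config["model_cfg"] if "model_cfg" in config else config
    text_cfg = parent["text_cfg"]

    if text_cfg.get("output_tokens") is True:
        return False
    text_cfg["output_tokens"] = True
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

For example, `enable_output_tokens("/your/path/to/BiomedCLIP/open_clip_config.json")` edits the file in place and is safe to run twice.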

6. Now you can load the BiomedCLIP model like this:

import open_clip

# Load the model weights and the tokenizer from the local BiomedCLIP folder
model, preprocess = open_clip.create_model_from_pretrained(
    'hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224',
    pretrained='/your/path/to/BiomedCLIP/open_clip_pytorch_model.bin',
    cache_dir='/your/path/to/BiomedCLIP')
tokenizer = open_clip.get_tokenizer(
    model_name='hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224',
    cache_dir='/your/path/to/BiomedCLIP')

📂 Step 2: Prepare your data to train B-CLIP

To train B-CLIP on your own data, organize your files as follows:

prompt.xlsx                     # Excel file with text metadata
parameter.xlsx                  # Excel file with image parameters, used for data synthesis by the B-Syn module
lesion_parameter.xlsx           # Excel file with lesion parameters, used for lesion synthesis by the B-Syn module
data/
├── subject001/
│   ├── brain.nii.gz            # brain image
│   ├── tissue.nii.gz           # tissue map (ground truth)
│   ├── dk-struct.nii.gz        # ROI map (ground truth)
│   ├── T2-brain.nii.gz         # T2 image, if available
│   ├── CT-brain.nii.gz         # CT image, if available
│   ├── ……                      # other modality images, if available
├── subject002
├── subject003
└── ……
lesion/
├── subject01/
│   ├── brain.nii.gz            # brain image
│   ├── lesion.nii.gz           # lesion map (ground truth)
│   ├── Flair-brain.nii.gz      # T2-FLAIR image, if available
│   ├── T2-brain.nii.gz         # T2 image, if available
│   ├── ……                      # other modality images, if available
├── subject02
├── subject03
└── ……
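
Before training, it can help to check that every subject folder is complete. The sketch below is not part of the repository: `find_missing_files` is a hypothetical helper, and it assumes that the files not marked as optional in the layout above are required for every subject.

```python
from pathlib import Path
from typing import Dict, List, Sequence

# Files assumed to be required per subject, taken from the layout above.
REQUIRED_DATA_FILES = ("brain.nii.gz", "tissue.nii.gz", "dk-struct.nii.gz")
REQUIRED_LESION_FILES = ("brain.nii.gz", "lesion.nii.gz")


def find_missing_files(root: str, required: Sequence[str]) -> Dict[str, List[str]]:
    """Map each subject folder under `root` to the required files it lacks."""
    missing: Dict[str, List[str]] = {}
    for subject_dir in sorted(Path(root).iterdir()):
        if not subject_dir.is_dir():
            continue  # skip stray files such as spreadsheets
        absent = [name for name in required if not (subject_dir / name).is_file()]
        if absent:
            missing[subject_dir.name] = absent
    return missing
```

`find_missing_files("data", REQUIRED_DATA_FILES)` returns an empty dictionary when every subject is complete; use `REQUIRED_LESION_FILES` for the lesion/ folder.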

🚀 Step 3: Train B-CLIP

Now you can start training B-CLIP, either from scratch or by fine-tuning our pretrained B-CLIP model. You can download the pretrained model through the following link: BCLIP

python ./BCLIP/train.py  # Please change the paths in the code to point to your own data

Get started with BrainSeg

📂 Step 1: Data preprocessing

Before training, preprocess your data with the same steps we used: bias field correction, skull stripping, registration of all images to the space of our training data, reorientation to a consistent RPI coordinate system, and cropping to (224, 256, 224). We provide a preprocessing script (preprocessing.py) to facilitate these steps. After preprocessing, structure your data directory to match the B-CLIP training format.
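
To illustrate the final cropping step, the sketch below computes per-axis crop and pad bounds for reaching (224, 256, 224). It is an assumption-laden example rather than the logic of the provided preprocessing script: centering the crop, and padding volumes that are smaller than the target, are choices made here for illustration, and both function names are hypothetical.

```python
from typing import List, Tuple

TARGET_SHAPE = (224, 256, 224)  # crop size stated in this README


def center_crop_pad_plan(size: int, target: int) -> Tuple[int, int, int, int]:
    """Plan a centered crop or pad along one axis.

    Returns (crop_start, crop_stop, pad_before, pad_after).
    """
    if size >= target:
        start = (size - target) // 2
        return start, start + target, 0, 0
    total_pad = target - size
    pad_before = total_pad // 2
    return 0, size, pad_before, total_pad - pad_before


def plan_for_volume(shape: Tuple[int, int, int]) -> List[Tuple[int, int, int, int]]:
    """Apply the per-axis plan to a 3D shape, one tuple per axis."""
    return [center_crop_pad_plan(s, t) for s, t in zip(shape, TARGET_SHAPE)]
```

For a volume of shape (256, 256, 200), the plan crops 16 voxels from each side of the first axis, keeps the second axis unchanged, and pads 12 voxels on each side of the third. The bounds can then be applied with array slicing and padding in the library of your choice.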

🚀 Step 2: Train BrainSeg

Now you can start training BrainSeg, either from scratch or by fine-tuning our pretrained models. You can download the pretrained models through the following links: BrainSeg_tissue for tissue segmentation, BrainSeg_parc for brain parcellation, and BrainSeg_lesion for lesion labeling.

python train.py  # Please change the path in the code to the path of your own data

Step 3: Inference using our pretrained model

We provide example data covering diverse age groups, multiple modalities, and lesion cases in Sample, along with default text metadata prompts for these samples in test.xlsx. You can run inference directly on these samples using our pretrained model. You can also test on your own data, provided it is structured as follows:

test.xlsx                   # text metadata prompt
Sample/
├── sub001/
│   ├── brain.nii.gz
│   ├── tissue.nii.gz
│   ├── dk-struct.nii.gz
│   ├── T2-brain.nii.gz
│   ├── ……                  # any other modalities, if available
├── sub002/
│   ├── brain.nii.gz
│   ├── tissue.nii.gz
│   ├── dk-struct.nii.gz
│   ├── CT-brain.nii.gz
│   ├── ……                  # any other modalities, if available
├── sub003
└── ……

1. Tissue segmentation

You can run tissue segmentation inference using the following command:

python Inference.py \
    --texts_path ./test.xlsx \
    --images_path ./Sample \
    --predir ./Sample \
    --model_dir /your/path/for/BrainSeg_model \
    --clip_dir /your/path/for/BCLIP \
    --img_size 224 256 224 \
    --in_channels 6 \
    --out_channels 4 \
    --mode tissue \
    --flag multi

- --texts_path: Path to the Excel file (.xlsx) containing text prompts (Default: ./test.xlsx)
- --images_path: Directory path of the input images to be processed (Default: ./Sample)
- --img_size: Input image size as a tuple (D, H, W) (Default: 224 256 224)
- --in_channels: Number of input channels (Default: 6)
- --out_channels: Number of output classes (Default: 4)
- --device: Device to run the model on, e.g., cuda for GPU or cpu (Default: cuda)
- --model_dir: Path to the directory containing your pretrained BrainSeg model
- --clip_dir: Path to the directory containing your pretrained B-CLIP model
- --predir: Directory path where the output results will be saved (Default: ./Sample)
- --mode: Specifies the prediction task (e.g., tissue, dk or lesion).
- --flag: Determines the input modality strategy (Default: multi)
  - multi: Uses all modalities in the subject folder for fusion inference.
  - single: Uses a single modality for inference. Must be used with --modality.
- --modality: Specifies the target modality filename when --flag is set to single (Default: CT-brain.nii.gz)
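
For readers who want to script around these options, the list above maps naturally onto an `argparse` parser. The following is a sketch that mirrors the documented flags and defaults, not the parser used by the repository's inference script. `build_parser` is a hypothetical name, and three details are assumptions: `--model_dir` and `--clip_dir` being required, the restricted choices for `--mode` and `--flag`, and the default for `--mode`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the options documented above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of the documented inference options")
    parser.add_argument("--texts_path", default="./test.xlsx")
    parser.add_argument("--images_path", default="./Sample")
    parser.add_argument("--predir", default="./Sample")
    # Assumption: no defaults are documented, so both paths are required here.
    parser.add_argument("--model_dir", required=True)
    parser.add_argument("--clip_dir", required=True)
    # Three integers on the command line, in (D, H, W) order.
    parser.add_argument("--img_size", type=int, nargs=3, default=[224, 256, 224])
    parser.add_argument("--in_channels", type=int, default=6)
    parser.add_argument("--out_channels", type=int, default=4)
    parser.add_argument("--device", default="cuda")
    # Assumption: the three documented tasks are the only valid modes.
    parser.add_argument("--mode", choices=["tissue", "dk", "lesion"], default="tissue")
    parser.add_argument("--flag", choices=["multi", "single"], default="multi")
    # Only meaningful together with --flag single.
    parser.add_argument("--modality", default="CT-brain.nii.gz")
    return parser
```

Parsing `--img_size 224 256 224` with this parser yields the list `[224, 256, 224]`, which matches how the size is written in the command above.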

2. Brain Parcellation

You can run brain parcellation inference using the following command:

python Inference.py \
    --texts_path ./test.xlsx \
    --images_path ./Sample \
    --predir ./Sample \
    --model_dir /your/path/for/BrainSeg_model \
    --clip_dir /your/path/for/BCLIP \
    --img_size 224 256 224 \
    --in_channels 7 \
    --out_channels 107 \
    --mode dk

3. Lesion labeling

You can run lesion labeling inference using the following command:

python Inference.py \
    --texts_path ./test.xlsx \
    --images_path ./Sample \
    --predir ./Sample \
    --model_dir /your/path/for/BrainSeg_model \
    --clip_dir /your/path/for/BCLIP \
    --img_size 224 256 224 \
    --in_channels 6 \
    --out_channels 4 \
    --mode lesion
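
The three commands differ only in `--mode` and the channel counts, so the per-task settings can be kept in one table. This is a minimal sketch: `TASK_SETTINGS` and `build_inference_args` are hypothetical names, and the channel values are copied from the commands in this README rather than read from the code.

```python
from typing import Dict, List

# Channel settings per task, copied from the three commands in this README.
TASK_SETTINGS: Dict[str, Dict[str, int]] = {
    "tissue": {"in_channels": 6, "out_channels": 4},
    "dk": {"in_channels": 7, "out_channels": 107},
    "lesion": {"in_channels": 6, "out_channels": 4},
}


def build_inference_args(mode: str, model_dir: str, clip_dir: str) -> List[str]:
    """Return the arguments to append after the inference script name."""
    settings = TASK_SETTINGS[mode]  # raises KeyError for an unknown task
    return [
        "--texts_path", "./test.xlsx",
        "--images_path", "./Sample",
        "--predir", "./Sample",
        "--model_dir", model_dir,
        "--clip_dir", clip_dir,
        "--img_size", "224", "256", "224",
        "--in_channels", str(settings["in_channels"]),
        "--out_channels", str(settings["out_channels"]),
        "--mode", mode,
    ]
```

The returned list can be passed to `subprocess.run` after the Python executable and the script name.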

📖 Citation

If you find this work useful in your research, please cite:

Shijie Huang^†, Zifeng Lian^†, Dengqiang Jia^†, Kaicong Sun^†, Xiaoye Li^†, Jiameng Liu^†, Yulin Wang, Caiwen Jiang, Fangmei Zhu, Zhongxiang Ding^*, Han Zhang^*, Geng Chen^*, Feng Shi^*, Dinggang Shen^*. BrainSeg: A Generalized Framework for Comprehensive Multimodal Brain Tissue Segmentation, Parcellation, and Lesion Labeling. (Under Review)

License

Copyright IDEA Lab, School of Biomedical Engineering, ShanghaiTech University, Shanghai, China.

Licensed under the GPL (General Public License);
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Repo for BrainSeg: A Generalized Framework for Comprehensive Multimodal Brain Tissue Segmentation, Parcellation, and Lesion Labeling
Contact: huangshj@shanghaitech.edu.cn
         lianzf2024@shanghaitech.edu.cn
