🧠 BrainSeg: A Generalized Framework for Comprehensive Multimodal Brain Tissue Segmentation, Parcellation, and Lesion Labeling
Official implementation code for BrainSeg. We propose a novel AI-based tool for comprehensive brain imaging segmentation with generalizability across multiple modalities, including MRI, CT, PET, and ultrasound, as well as across the lifespan (from fetuses to the elderly). This framework consists of three main components: B-Syn, B-CLIP, and BrainSeg.
To ensure a clean workspace and prevent dependency conflicts, we strongly recommend creating a new Conda environment before running the code.
# Create a new conda environment named 'BrainSeg' with Python 3.9 and install the required libraries
cd /your/path/to/this/repository
conda env create -f environment.yml -n BrainSeg
# Activate the environment
conda activate BrainSeg
We provide a demo script for immediate testing and usage of our B-Syn module.
First, navigate to the B-Syn source directory:
cd ./BSyn/BSyn/
1. Multimodal Synthesis
You can synthesize images for different modalities by specifying the target output filename. Please refer to our function arguments in the code for a full list of supported modality parameters. For example, to generate a T2-weighted MRI image:
python BSyn_Demo.py --modality T2-brain.nii.gz
2. Lesion Synthesis
B-Syn supports the simulation of pathological features, such as tumors and strokes. You can control the pathology type using the --lesion_type argument.
Generate tumor data:
python BSyn_Demo.py --modality Flair-brain.nii.gz --lesion_type tumor
Generate stroke data:
python BSyn_Demo.py --modality DWI-brain.nii.gz --lesion_type stroke
B-CLIP fine-tunes the BiomedCLIP text encoder with LoRA, so you first need to set up BiomedCLIP:
1. First, clone the latest BiomedCLIP model (we used commit 27005c2; earlier versions may have compatibility issues):
mkdir BiomedCLIP
cd ./BiomedCLIP
git clone https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224
2. Then clone the latest BiomedBERT-abstract:
mkdir BiomedBERT-abstract
cd ./BiomedBERT-abstract
git clone https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract
3. If you encounter network issues when running git clone, we also provide the already downloaded folders for convenience through the following links: BiomedCLIP and BiomedBERT-abstract.
4. To load BiomedCLIP from your own local path, you need to make a small modification to the open_clip source code. Please follow the comment in mlfoundations/open_clip#772.
5. Finally, modify the model configuration so that the text encoder outputs tokens. In /your/path/to/BiomedCLIP/open_clip_config.json, add the output_tokens setting to the "text_cfg" dictionary:
Before:
...
"context_length": 256
}
After:
...
"context_length": 256,
"output_tokens": true
}
6. Now you can load the BiomedCLIP model like this:
import open_clip

model, preprocess = open_clip.create_model_from_pretrained(
    'hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224',
    pretrained='/your/path/to/BiomedCLIP/open_clip_pytorch_model.bin',
    cache_dir='/your/path/to/BiomedCLIP')
tokenizer = open_clip.get_tokenizer(
    model_name='hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224',
    cache_dir='/your/path/to/BiomedCLIP')
You can organize your file directory as follows to train B-CLIP on your own data:
prompt.xlsx              # Excel file with text metadata
parameter.xlsx           # Excel file with image parameters, used for data synthesis in the B-Syn module
lesion_parameter.xlsx    # Excel file with lesion parameters, used for lesion synthesis in the B-Syn module
data/
├── subject001/
│   ├── brain.nii.gz         # brain image
│   ├── tissue.nii.gz        # tissue map GT
│   ├── dk-struct.nii.gz     # ROI map GT
│   ├── T2-brain.nii.gz      # T2 modality image, if available
│   ├── CT-brain.nii.gz      # CT modality image, if available
│   └── ...                  # other modality images, if available
├── subject002
├── subject003
└── ...
lesion/
├── subject01/
│   ├── brain.nii.gz         # brain image
│   ├── lesion.nii.gz        # lesion map GT
│   ├── Flair-brain.nii.gz   # T2-FLAIR modality image, if available
│   ├── T2-brain.nii.gz      # T2 modality image, if available
│   └── ...                  # other modality images, if available
├── subject02
├── subject03
└── ...
Now you can start training B-CLIP. You can choose to train from scratch or load our pre-trained B-CLIP model for fine-tuning. You can download our pre-trained B-CLIP model through the following link: BCLIP
python /BCLIP/train.py # Please change the path in the code to the path of your own data
Before starting training, you should preprocess your data following the same steps as ours: bias field correction, skull stripping, registration of all images to the space of our training data, reorientation to a consistent RPI coordinate system, and cropping of the images to (224, 256, 224). We provide a preprocessing script to facilitate these steps. After preprocessing, your data directory should be structured to match the B-CLIP training format.
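Before launching training, it can help to confirm that every subject folder contains the files shown in the B-CLIP training layout. The following is a minimal sketch and not part of the BrainSeg code base: the function name `find_incomplete_subjects` is hypothetical, and the required file names are taken from the directory listing above on the assumption that all three are mandatory for each subject under `data/`.

```python
from pathlib import Path

# Files assumed to be mandatory for every subject under data/
# (names copied from the directory listing in this README).
REQUIRED_FILES = ("brain.nii.gz", "tissue.nii.gz", "dk-struct.nii.gz")


def find_incomplete_subjects(data_dir, required=REQUIRED_FILES):
    """Return {subject_name: [missing file names]} for incomplete subjects."""
    problems = {}
    for subject in sorted(Path(data_dir).iterdir()):
        if not subject.is_dir():
            continue  # ignore stray files that sit next to the subject folders
        missing = [name for name in required if not (subject / name).is_file()]
        if missing:
            problems[subject.name] = missing
    return problems
```

Calling `find_incomplete_subjects("data")` returns an empty dictionary when every subject is complete. For the `lesion/` folder, the same function could be called with `required=("brain.nii.gz", "lesion.nii.gz")`.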
Now you can start training BrainSeg. You can choose to train from scratch or load our pre-trained model of BrainSeg for fine-tuning. You can download our pretrained BrainSeg model through the following link: BrainSeg_tissue for tissue segmentation, BrainSeg_parc for brain parcellation and BrainSeg_lesion for lesion labeling
python train.py # Please change the path in the code to the path of your own data
We provide a set of example samples covering diverse age groups, multiple modalities, and lesion cases in Sample and a default text metadata prompt for the sample data in test.xlsx. You can run inference directly on these samples using our pre-trained model. You can also test on your own data, provided it is structured as follows:
test.xlsx    # text metadata prompt
Sample/
├── sub001/
│   ├── brain.nii.gz
│   ├── tissue.nii.gz
│   ├── dk-struct.nii.gz
│   ├── T2-brain.nii.gz
│   └── ...              # any other modalities, if available
├── sub002/
│   ├── brain.nii.gz
│   ├── tissue.nii.gz
│   ├── dk-struct.nii.gz
│   ├── CT-brain.nii.gz
│   └── ...              # any other modalities, if available
├── sub003
└── ...
You can run tissue segmentation inference using the following command:
python inference.py \
--texts_path ./test.xlsx \
--images_path ./Sample \
--predir ./Sample \
--model_dir /your/path/for/BrainSeg_model \
--clip_dir /your/path/for/BCLIP \
--img_size 224 256 224 \
--in_channels 6 \
--out_channels 4 \
--mode tissue \
--flag multi
- --texts_path: Path to the Excel file (.xlsx) containing text prompts (Default: ./test.xlsx)
- --images_path: Directory path of the input images to be processed (Default: ./Sample)
- --img_size: Input image size as a tuple (D, H, W) (Default: 224 256 224)
- --in_channels: Number of input channels (Default: 6)
- --out_channels: Number of output classes (Default: 4)
- --device: Device to run the model on, e.g., cuda for GPU or cpu (Default: cuda)
- --model_dir: Path to the directory containing your pretrained BrainSeg model
- --clip_dir: Path to the directory containing your pretrained B-CLIP model
- --predir: Directory path where the output results will be saved (Default: ./Sample)
- --mode: Specifies the prediction task (e.g., tissue, dk or lesion).
- --flag: Determines the input modality strategy (Default: multi)
  - multi: Uses all modalities in the subject folder for fusion inference.
  - single: Uses a single modality for inference. Must be used with --modality.
- --modality: Specifies the target modality filename when --flag is set to single (Default: CT-brain.nii.gz)
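The inference commands in this README differ only in `--mode` and the channel counts, so a small helper can assemble them consistently. The sketch below is hypothetical: the names `TASK_CHANNELS` and `build_inference_command` are not part of the repository, and the helper uses only the flags, defaults, and channel values documented here.

```python
import shlex

# Channel settings copied from the example commands in this README.
TASK_CHANNELS = {
    "tissue": {"in_channels": 6, "out_channels": 4},
    "dk": {"in_channels": 7, "out_channels": 107},
    "lesion": {"in_channels": 6, "out_channels": 4},
}


def build_inference_command(mode, model_dir, clip_dir,
                            texts_path="./test.xlsx", images_path="./Sample",
                            predir="./Sample", flag="multi", modality=None):
    """Return the argument list for one inference.py call."""
    if mode not in TASK_CHANNELS:
        raise ValueError(f"unknown mode: {mode!r}")
    if flag == "single" and modality is None:
        raise ValueError("--flag single must be used with --modality")
    channels = TASK_CHANNELS[mode]
    cmd = [
        "python", "inference.py",
        "--texts_path", texts_path,
        "--images_path", images_path,
        "--predir", predir,
        "--model_dir", model_dir,
        "--clip_dir", clip_dir,
        "--img_size", "224", "256", "224",
        "--in_channels", str(channels["in_channels"]),
        "--out_channels", str(channels["out_channels"]),
        "--mode", mode,
        "--flag", flag,
    ]
    if flag == "single":
        cmd += ["--modality", modality]
    return cmd


# Print the parcellation command as a single shell-safe string.
example = build_inference_command(
    "dk", "/your/path/for/BrainSeg_model", "/your/path/for/BCLIP")
print(shlex.join(example))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains `inference.py`.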
You can run brain parcellation inference using the following command:
python inference.py \
--texts_path ./test.xlsx \
--images_path ./Sample \
--predir ./Sample \
--model_dir /your/path/for/BrainSeg_model \
--clip_dir /your/path/for/BCLIP \
--img_size 224 256 224 \
--in_channels 7 \
--out_channels 107 \
--mode dk
You can run lesion labeling inference using the following command:
python inference.py \
--texts_path ./test.xlsx \
--images_path ./Sample \
--predir ./Sample \
--model_dir /your/path/for/BrainSeg_model \
--clip_dir /your/path/for/BCLIP \
--img_size 224 256 224 \
--in_channels 6 \
--out_channels 4 \
--mode lesion
If you find this work useful in your research, please cite:
Shijie Huang†, Zifeng Lian†, Dengqiang Jia†, Kaicong Sun†, Xiaoye Li†, Jiameng Liu†, Yulin Wang, Caiwen Jiang, Fangmei Zhu, Zhongxiang Ding*, Han Zhang*, Geng Chen*, Feng Shi*, Dinggang Shen*. BrainSeg: A Generalized Framework for Comprehensive Multimodal Brain Tissue Segmentation, Parcellation, and Lesion Labeling. (Under Review)
Copyright IDEA Lab, School of Biomedical Engineering, ShanghaiTech University, Shanghai, China.
Licensed under the GPL (General Public License);
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Contact: huangshj@shanghaitech.edu.cn
lianzf2024@shanghaitech.edu.cn
