brat: Aligned Multi-View Embeddings for Brain MRI Analysis

Official implementation and foundation model weights for brat, a vision-language pre-training framework trained on a large-scale dataset of paired brain MRIs and clinical reports (~80,000 sessions). The models are intended as a starting point for downstream clinical tasks such as report generation, classification, and segmentation.

Read the Paper | BibTeX

Model Zoo

We provide several foundation models trained on T1 post-contrast (T1c) scans, plus a multimodal variant. Each checkpoint contains a vision backbone connected to a QFormer-like module that produces the multi-view embeddings. You can use either the full model (vision backbone + multi-view embeddings) or only the vision backbone for feature extraction.

Model Name	Input Modalities	Vision Backbone	Weights
brat-DenseNet121	T1c	DenseNet-121	Download
brat-ViT-Base	T1c	ViT-B/16	Download
brat-ResNet50	T1c	ResNet-50	Download
brat-Multimodal	T1c, T1, T2, FLAIR	DenseNet-169	Download

Note (a): The ViT model provided here is an updated version that outperforms the variant originally reported in the paper and matches the performance of our ResNet foundation model.
Note (b): brat-Multimodal is not reported in the paper and is our strongest model to date.

Usage

Install the required packages using pip:

pip install -r requirements.txt

Feature Extraction

The extract_features.py script contains the necessary code to load the models and generate embeddings for a toy input volume.

Full Embedding (Vision backbone + multi-view embeddings):

python extract_features.py \
  --weights /path/to/weights.bin \
  --vision-model-name densenet121 \
  --in-channels 1 \
  --mode full

Vision backbone only:

python extract_features.py \
  --weights /path/to/weights.bin \
  --vision-model-name vit \
  --in-channels 1 \
  --mode vision

Preprocessing Pipeline

For the Multimodal model, modalities must be stacked in the channel dimension ($C=4$) in the following order: T1 post-contrast (T1c); T1; T2; FLAIR. In our dataset, a non-negligible portion of studies are missing one or more of these sequences. In such cases, missing modalities were zero-filled to maintain the 4-channel input structure.
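
The stacking and zero-filling described above can be sketched in plain NumPy. This is a minimal illustration, not code from this repository: the helper name `stack_modalities`, the dictionary keys, and the assumption that each modality has already been preprocessed to the same spatial shape are all hypothetical.

```python
import numpy as np

# Channel order stated in this README for the multimodal model.
MODALITY_ORDER = ("T1c", "T1", "T2", "FLAIR")


def stack_modalities(volumes, spatial_shape):
    """Stack modalities into a (4, D, H, W) array, zero-filling missing ones.

    `volumes` maps a modality name to an array of shape `spatial_shape`.
    A missing key (or a None value) becomes an all-zero channel.
    This helper is hypothetical and only illustrates the convention.
    """
    spatial_shape = tuple(spatial_shape)
    channels = []
    for name in MODALITY_ORDER:
        vol = volumes.get(name)
        if vol is None:
            # Missing sequence: keep the channel slot, fill it with zeros.
            channels.append(np.zeros(spatial_shape, dtype=np.float32))
            continue
        vol = np.asarray(vol, dtype=np.float32)
        if vol.shape != spatial_shape:
            raise ValueError(
                f"{name} has shape {vol.shape}, expected {spatial_shape}"
            )
        channels.append(vol)
    return np.stack(channels, axis=0)


# Toy study in which only T1c and FLAIR are available.
shape = (32, 256, 256)
study = {"T1c": np.random.rand(*shape), "FLAIR": np.random.rand(*shape)}
x = stack_modalities(study, shape)
print(x.shape)
```

Keeping a fixed slot per modality means the channel index always identifies the sequence, regardless of which sequences a given study contains.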

Below is the MONAI-based preprocessing pipeline we used for our pre-training runs. Images are expected to be in NIfTI format (.nii or .nii.gz).

- LoadImaged: {}
- EnsureChannelFirstd: {channel_dim: 'no_channel'}
- Spacingd: {pixdim: [1, 1, 1], mode: 'bilinear'}
- Orientationd: {axcodes: 'SAR'}
- Resized: {spatial_size: [32, 256, 256]}
- NormalizeIntensityd: {channel_wise: true, nonzero: true}
- ScaleIntensityd: {channel_wise: true, maxv: 1.0}
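
To make the last two steps concrete, the sketch below approximates them in plain NumPy: a per-channel z-score computed over non-zero voxels only, followed by a per-channel min-max rescale to [0, 1]. It reflects our reading of how `NormalizeIntensityd` (with `nonzero: true`) and `ScaleIntensityd` behave and is not a substitute for the MONAI transforms. The function names are hypothetical, and `minv=0.0` is an assumed default since the config above only sets `maxv`.

```python
import numpy as np


def normalize_nonzero(channel):
    """Z-score one channel using statistics of its non-zero voxels only."""
    out = channel.astype(np.float32, copy=True)
    mask = out != 0
    if not mask.any():
        # All-zero channel (e.g. a zero-filled modality): nothing to do.
        return out
    mean = out[mask].mean()
    std = out[mask].std()
    if std == 0:
        std = 1.0  # constant foreground: only subtract the mean
    out[mask] = (out[mask] - mean) / std
    return out


def scale_to_range(channel, minv=0.0, maxv=1.0):
    """Min-max rescale one channel to [minv, maxv]."""
    lo, hi = channel.min(), channel.max()
    if hi == lo:
        # Constant channel: avoid dividing by zero.
        return channel * minv
    return (channel - lo) / (hi - lo) * (maxv - minv) + minv


def normalize_and_scale(image):
    """Apply both steps independently to each channel of a (C, D, H, W) array."""
    return np.stack(
        [scale_to_range(normalize_nonzero(c)) for c in image], axis=0
    )


# Toy input: two channels, with a zero background band and one missing modality.
rng = np.random.default_rng(0)
image = rng.uniform(50, 200, size=(2, 4, 8, 8)).astype(np.float32)
image[:, :, :2, :] = 0  # background voxels
image[1] = 0            # second modality is missing (zero-filled)
out = normalize_and_scale(image)
print(out.shape, float(out.min()), float(out.max()))
```

In this sketch a zero-filled channel stays all zeros after both steps, so missing modalities remain distinguishable from real data.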

Citation

If you use our models, please consider citing our paper:

@inproceedings{kayser2026brat,
  title     = {brat: Aligned Multi-View Embeddings for Brain MRI Analysis},
  author    = {Kayser, Maxime and Gridnev, Maksim and Wang, Wanting and Bain, Max and Rangnekar, Aneesh and Chatterjee, Avijit and Petrov, Aleksandr and Veeraraghavan, Harini and Swinburne, Nathaniel C.},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year      = {2026},
}

License

This project and the accompanying model weights are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License.

Academic/Research Use: Encouraged and permitted.
Commercial Use: Prohibited.

For commercial licensing inquiries, or if you are unsure whether your use case qualifies as non-commercial, please open an issue or contact the maintainers directly.
