Multimodal Co-Attention Transformer for Survival Prediction in Gigapixel Whole Slide Images - ICCV 2021
Updated Mar 11, 2022 · Jupyter Notebook
Visual Fusion of Camera and LiDAR Sensor
This study introduces MultiBanFakeDetect, a multimodal dataset for Bangla fake news detection that combines textual and visual information, along with TextFakeNet for text analysis and MultiFusionFake for fusing the two modalities.
Early Fusion, Late Fusion, and Hybrid Fusion
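The distinction this entry names can be sketched in a few lines. Below is a minimal, framework-agnostic NumPy illustration of early fusion (concatenate features, then one joint classifier) versus late fusion (one classifier per modality, then average the scores); all weights and dimensions are hypothetical, not any listed repo's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unimodal features: one image embedding and one text embedding per sample.
img_feat = rng.normal(size=(4, 16))   # batch of 4, 16-dim image features
txt_feat = rng.normal(size=(4, 8))    # batch of 4, 8-dim text features

# Early fusion: concatenate raw features, then apply a single joint classifier.
W_joint = rng.normal(size=(24, 2))    # hypothetical joint classifier weights
early_logits = np.concatenate([img_feat, txt_feat], axis=1) @ W_joint

# Late fusion: run a separate classifier per modality, then average the scores.
W_img = rng.normal(size=(16, 2))
W_txt = rng.normal(size=(8, 2))
late_logits = 0.5 * (img_feat @ W_img + txt_feat @ W_txt)

print(early_logits.shape, late_logits.shape)  # both (4, 2)
```

Hybrid fusion sits between the two: modalities are combined at an intermediate feature level rather than at the raw input or the final score.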
Lightweight multi-frame integration for YOLO (ECMR 2025 paper)
Streamlit app for demonstrating multi-modal (vision + language) modelling in PyTorch.
Multimodal AI from scratch: RGB + LiDAR sensor fusion with early-, late-, and intermediate-fusion comparisons, CLIP-style contrastive pre-training, and cross-modal projection using PyTorch.
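As a rough illustration of the CLIP-style contrastive objective this entry mentions, here is a hedged NumPy sketch of a symmetric InfoNCE loss over paired RGB/LiDAR embeddings; the function name, dimensions, and temperature are illustrative assumptions, not the repo's code.

```python
import numpy as np

def clip_contrastive_loss(rgb_emb, lidar_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired RGB/LiDAR embeddings."""
    # L2-normalize so the dot product is cosine similarity.
    rgb = rgb_emb / np.linalg.norm(rgb_emb, axis=1, keepdims=True)
    lid = lidar_emb / np.linalg.norm(lidar_emb, axis=1, keepdims=True)
    logits = rgb @ lid.T / temperature       # (B, B) similarity matrix
    labels = np.arange(len(logits))          # matching pairs lie on the diagonal

    def xent(l):
        # Cross-entropy of the softmax over each row against the diagonal label.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average both directions: RGB -> LiDAR and LiDAR -> RGB.
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
rgb = rng.normal(size=(8, 32))
loss = clip_contrastive_loss(rgb, rgb.copy())  # identical pairs -> low loss
print(round(float(loss), 4))
```

Training such an objective pulls matched RGB/LiDAR pairs together in the shared embedding space while pushing mismatched pairs apart, which is what makes downstream cross-modal projection possible.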
🌟 Enhance YOLOv7 with multi-frame detection for improved robustness against blur and occlusion, using efficient weak supervision and minimal model changes.
Low-resource multimodal hate speech detection leveraging acoustic and textual representations for robust moderation in Telugu.
Early-fusion multimodal machine learning for emotion classification from social media videos (visual, audio, text). Portfolio project from SATRIA DATA 2025.