INTEGRATION OF DEEP LEARNING MODELS AND VISUAL ATTENTION MECHANISMS FOR LUNG DISEASE CLASSIFICATION FROM CHEST X-RAY IMAGE

Introduction

Chest X-ray (CXR) imaging is a common diagnostic method for chest conditions; it uses a small dose of ionizing radiation to produce images of internal structures. It helps doctors identify and monitor lung diseases, but accurate reading requires expertise and time, and errors become more likely when many CXRs must be reviewed each day. Prior research suggests that deep learning models can improve diagnostic accuracy. This study focuses on detecting the key regions in CXRs for common lung conditions: tuberculosis, pneumonia, mass, and effusion. The goal is to support automated tools that reduce errors and improve diagnostic efficiency.

Figure 1: Overview diagram of the research method.

Dataset

Source: Tuberculosis Chest Xray Cleaned Dataset, Tuberculosis classification, NIH Chest Xray dataset, CXR dataset, VinDr CXR Normal
From: Kaggle

Condition	Tuberculosis	Pneumonia	Mass	Effusion	Total
Cases	7089	2773	3076	1539	2898

Figure 2: Distributions of the datasets used.

Research Method

Occlusion-based Method: Different regions of an X-ray image are occluded in turn, and the change in the model's prediction is used to assess how important each region is for disease classification. Figure 3 shows the nine regions occluded in the experiments. The deep learning models InceptionV3, MobileNet, VGG16, and YOLOv8 are used to predict disease types.

Figure 3: The nine glimpse positions stepped through on a chest X-ray image.
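
The occlusion procedure described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes the nine regions form a 3x3 grid, that occluded pixels are filled with a constant, and that `predict_fn` is a hypothetical callable mapping a grayscale image to an array of class probabilities. The names `grid_boxes` and `occlusion_scores` are also illustrative.

```python
import numpy as np


def grid_boxes(height, width, rows=3, cols=3):
    """Return (top, bottom, left, right) boxes tiling the image in a rows x cols grid."""
    ys = np.linspace(0, height, rows + 1).astype(int)
    xs = np.linspace(0, width, cols + 1).astype(int)
    return [(ys[r], ys[r + 1], xs[c], xs[c + 1])
            for r in range(rows) for c in range(cols)]


def occlusion_scores(image, predict_fn, class_index, rows=3, cols=3, fill_value=0.0):
    """Drop in the score of class_index when each grid region is masked out.

    predict_fn is an assumed interface: image -> 1-D array of class probabilities.
    A larger drop means the region mattered more for that class.
    """
    baseline = predict_fn(image)[class_index]
    drops = []
    for top, bottom, left, right in grid_boxes(*image.shape[:2], rows, cols):
        occluded = image.copy()
        occluded[top:bottom, left:right] = fill_value  # hide this region only
        drops.append(baseline - predict_fn(occluded)[class_index])
    return np.array(drops).reshape(rows, cols)
```

Calling `occlusion_scores(image, predict_fn, class_index)` returns a 3x3 array whose largest entry marks the region the prediction depends on most.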
RAM Attention Mechanism Model: The Recurrent Attention Model (RAM) is used to predict the important regions of an X-ray image during classification. The experiment evaluates whether focusing on these critical regions improves classification accuracy and speed.
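
A recurrent attention model looks at the image through small patches ("glimpses") centred on locations it chooses. The sketch below shows only that cropping step, under stated assumptions: a 2-D grayscale image, a single patch scale, locations given as (y, x) in [-1, 1], and zero padding at the borders. The function name `extract_glimpse` is illustrative and is not taken from the repository.

```python
import numpy as np


def extract_glimpse(image, center, size):
    """Crop a size x size patch centred at `center` from a 2-D image.

    center is (y, x) in normalized [-1, 1] coordinates, where (-1, -1) is the
    top-left pixel and (1, 1) the bottom-right pixel. Areas outside the image
    are zero-padded.
    """
    height, width = image.shape
    # Map normalized coordinates to pixel indices.
    cy = int(round((center[0] + 1) / 2 * (height - 1)))
    cx = int(round((center[1] + 1) / 2 * (width - 1)))
    half = size // 2
    padded = np.pad(image, half, mode="constant")
    # Pixel (cy, cx) sits at (cy + half, cx + half) in the padded image,
    # so the patch starts at (cy, cx) in padded coordinates.
    return padded[cy:cy + size, cx:cx + size]
```

In a full model, the patch would be passed to a recurrent network that also emits the next location; that part is omitted here.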
Deep Learning Classification Model: The two highest-performing models from the experiments, YOLOv8 and DenseNet121, are trained to classify each X-ray image as one of the disease labels or as "normal".

Result

Model	Accuracy (%)	Precision (%)	Recall (%)	F1-score (%)
YOLOv8 (516×516)	89.0934	87.8825	83.9925	85.3262
DenseNet201 (516×516)	79.3532	73.4156	72.2743	72.5185
YOLOv8 (256×256)	80.8214	75.3052	73.4034	73.6246
DenseNet201 (256×256)	77.3263	73.8462	74.7169	73.9753
RAM	47.4672	37.4673	36.5757	36.4369
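
For reference, the four metrics in the table can be computed from a multi-class confusion matrix as sketched below. The source does not state how precision, recall, and F1-score were averaged across classes; this sketch assumes macro averaging (an unweighted mean over classes), and the function name `macro_metrics` is illustrative.

```python
def macro_metrics(confusion):
    """Accuracy and macro-averaged precision, recall, F1 (all in percent).

    confusion[i][j] is the number of samples with true class i predicted as class j.
    Macro averaging is an assumption; the source does not name the averaging scheme.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if (p + r) else 0.0)
    return {
        "accuracy": 100 * correct / total,
        "precision": 100 * sum(precisions) / n,
        "recall": 100 * sum(recalls) / n,
        "f1": 100 * sum(f1s) / n,
    }
```

With macro averaging, the F1-score is the mean of per-class F1 values, so it need not equal the harmonic mean of the reported precision and recall.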

Grad-CAM (Gradient-weighted Class Activation Mapping) is a visualization technique for interpreting the decisions of deep learning models, including YOLOv8. It highlights the areas of an image that contribute most to a prediction. YOLOv8 divides the image into four regions for analysis and prediction, and Grad-CAM visualizes which of these regions the model focuses on, which helps when fine-tuning the model for better performance. Figures 4 to 8 show the areas YOLOv8 relies on for each class.
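
The core Grad-CAM computation is small enough to sketch directly. The example below assumes the activations of one convolutional layer and the gradients of the target class score with respect to those activations have already been captured as NumPy arrays (for example through framework hooks, which are not shown). The function name `grad_cam` and the array layout are assumptions for illustration, not the repository's implementation.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one layer's activations and gradients.

    Both inputs have shape (channels, height, width). The result has shape
    (height, width) and is scaled to [0, 1] when any positive evidence exists.
    """
    # Channel weights: global average of the gradients over spatial positions.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the activation maps over channels.
    cam = np.tensordot(weights, activations, axes=1)
    # ReLU keeps only regions that increase the class score.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The heatmap is usually resized to the input resolution and overlaid on the X-ray to produce images like those in the figures.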

Figure 4: Image analysis of Effusion disease from the YOLOv8 model using Grad-CAM.
Figure 5: Image analysis of Mass disease from the YOLOv8 model using Grad-CAM.
Figure 6: Image analysis of Normal status from the YOLOv8 model using Grad-CAM.
Figure 7: Image analysis of Pneumonia disease from the YOLOv8 model using Grad-CAM.
Figure 8: Image analysis of Tuberculosis disease from the YOLOv8 model using Grad-CAM.
