A Multimodal Deep Learning Framework for Skin Cancer Detection and 3D Skin Reconstruction
AI & Data
Semester programme: Master of Applied IT
Client company: Bergman Clinics
Project group members: Ivan Bekriev
Cesar van Leuffelen
Project description
The project consists of two main parts: combining multiple images of the same patient to improve AI detection of skin cancer, and creating a 3D reconstruction of the skin.
The two main research questions are:
- How does combining images from the same patient influence the accuracy of the AI solution?
- Which technique for 3D reconstruction yields the best results within Mohs surgery's constrained environment?
Context
Basal cell carcinoma (BCC) is the most common form of non-melanoma skin cancer, and its incidence is increasing worldwide. One of the most widely used treatments for BCC is Mohs micrographic surgery, in which the surgeon removes tissue surrounding the tumor to ensure complete excision while preserving as much healthy tissue as possible. However, Mohs surgery is a time-consuming procedure, as multiple layers of skin may need to be removed and examined one by one. This analysis can take several days and places a significant burden on the healthcare system. This project investigates how modern technologies can be used to improve the efficiency of this process.
The first part of the project explores whether combining information from multiple skin slices can enhance the performance of AI models in detecting skin cancer. The second part focuses on reconstructing the skin from these slices to provide a more accurate representation of the tumor’s location.
Results
The AI part of the project showed that the two methods used, intermediate fusion and output fusion, achieved strong performance in distinguishing cancer from non-cancer tissue, with accuracies of 85% and 97% respectively. The models were further evaluated on unseen data from 10 patients, where both approaches correctly classified all cases. However, these results should be interpreted with caution, as limitations in the dataset prevented a more comprehensive evaluation.
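To illustrate the difference between the two fusion strategies, the sketch below shows how several slice images from one patient could be combined either at the feature level (intermediate fusion) or at the prediction level (output fusion). The backbone, layer sizes, and number of slices are hypothetical choices made for brevity, not the exact architecture used in the project.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class IntermediateFusionNet(nn.Module):
    """Intermediate fusion: encode each slice separately, then merge
    the feature vectors before a single classification head."""

    def __init__(self, num_slices: int = 3):
        super().__init__()
        backbone = models.resnet18(weights=None)  # hypothetical backbone
        backbone.fc = nn.Identity()               # keep the 512-d features
        self.encoder = backbone
        self.classifier = nn.Sequential(
            nn.Linear(512 * num_slices, 128),
            nn.ReLU(),
            nn.Linear(128, 2),                    # cancer vs. non-cancer
        )

    def forward(self, slices: torch.Tensor) -> torch.Tensor:
        # slices: (batch, num_slices, 3, H, W)
        feats = [self.encoder(slices[:, i]) for i in range(slices.shape[1])]
        return self.classifier(torch.cat(feats, dim=1))


class OutputFusionNet(nn.Module):
    """Output fusion: classify each slice independently and average
    the per-slice probabilities into one patient-level prediction."""

    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Linear(512, 2)
        self.model = backbone

    def forward(self, slices: torch.Tensor) -> torch.Tensor:
        probs = [self.model(slices[:, i]).softmax(dim=1)
                 for i in range(slices.shape[1])]
        return torch.stack(probs).mean(dim=0)     # average slice predictions


if __name__ == "__main__":
    batch = torch.randn(2, 3, 3, 224, 224)        # 2 patients, 3 slices each
    print(IntermediateFusionNet(num_slices=3)(batch).shape)  # (2, 2) logits
    print(OutputFusionNet()(batch).shape)                    # (2, 2) probabilities
```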
For the 3D reconstruction, several techniques were applied: interpolation, stacking, and NeRF (neural radiance fields). Expert evaluation showed that interpolation and stacking outperformed NeRF across all metrics (expert scores of 3.5-4.5 versus 1.5-2.5, with faster processing). However, hold-out testing revealed that interpolation does not create new biological information but merely bridges the gaps between existing slices (SSIM 0.24-0.34). NeRF currently underperforms, but with sufficient resources it could potentially learn inherent tissue structures and become a general model, making it a promising direction for future development despite its current limitations.
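As a minimal sketch of the hold-out idea behind the reported SSIM values: a middle slice is left out of the stack, reconstructed by linearly blending its two neighbours, and then compared against the real slice using SSIM. The slice stack, shapes, and blending weight below are hypothetical and stand in for the project's actual data and pipeline.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim


def interpolate_slice(prev_slice: np.ndarray, next_slice: np.ndarray,
                      alpha: float = 0.5) -> np.ndarray:
    """Linearly blend two neighbouring slices to approximate the one between."""
    return (1.0 - alpha) * prev_slice + alpha * next_slice


def holdout_ssim(stack: np.ndarray, index: int) -> float:
    """Leave out stack[index], reconstruct it from its neighbours,
    and score the reconstruction against the real slice with SSIM."""
    predicted = interpolate_slice(stack[index - 1], stack[index + 1])
    true_slice = stack[index]
    return ssim(true_slice, predicted,
                data_range=float(true_slice.max() - true_slice.min()))


if __name__ == "__main__":
    # Hypothetical stack of 5 grayscale slices (256x256 pixels each).
    rng = np.random.default_rng(0)
    slices = rng.random((5, 256, 256)).astype(np.float32)
    print(f"Hold-out SSIM for slice 2: {holdout_ssim(slices, 2):.3f}")
```

A low hold-out SSIM, such as the 0.24-0.34 range observed, indicates that the interpolated slice fills the gap smoothly but does not reproduce the actual tissue structure of the held-out slice.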