Skin cancer skin tone balancing using artificial images
AI & Data
Semester programme: Artificial Intelligence
Research group: Sustainable Data & AI Application
Project group members: Sven Simons
George Holynski
Vanesa Taneva
Mirco La Ferrara
Project description
The project creates synthetic images depicting skin cancer, mainly on darker skin tones. There is a large discrepancy between the number of images available for lighter and for darker skin tones. This project aims to bridge that gap by using synthetically created images to augment existing datasets.
Context
The project sits in the healthcare domain. In general, people with a lighter skin tone develop skin cancer more easily than people with a darker skin tone. Because of this, far more images exist of lighter skin than of darker skin. These datasets have already been used to train models that help doctors diagnose patients. However, because they had less data on darker skin to train on, these models are worse at diagnosing people with a darker skin tone.
Results
Our most important finding is that it is very difficult to obtain a well-functioning model. Although a large amount of data is available, the share of images showing darker skin tones is very limited (~4,000 images out of a dataset of ~440,000). In addition, skin cancer is not a single condition but takes several forms, so ideally the data would be split by form before training and generation. However, this would leave extremely little data for certain types of skin cancer. For this project we therefore focused on the benign/malignant split. With this split we were able to generate reasonably realistic images (at least to the human eye), but these images have not yet been validated with a pretrained model.
Future work should therefore focus on the two points mentioned above: splitting the data by skin cancer type, and validating the generated images with an existing pretrained model.
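To make the scale of the imbalance concrete, the following is a minimal sketch of the arithmetic only, not the project's actual pipeline. It uses the approximate counts given above (~4,000 darker-skin-tone images in a ~440,000 image dataset) and estimates how many synthetic images would have to be added to reach a given share. The function name and the target shares are hypothetical and chosen purely for illustration.

```python
import math


def synthetic_images_needed(total: int, minority: int, target_share: float) -> int:
    """Estimate how many synthetic minority images to add so that the
    minority group makes up `target_share` of the augmented dataset.

    Solves (minority + n) / (total + n) >= target_share for the smallest n.
    """
    if not 0 < target_share < 1:
        raise ValueError("target_share must be strictly between 0 and 1")
    if not 0 <= minority <= total:
        raise ValueError("minority must be between 0 and total")
    needed = (target_share * total - minority) / (1 - target_share)
    # No images are needed if the target share is already met.
    return max(0, math.ceil(needed))


# Approximate counts taken from the text above.
TOTAL_IMAGES = 440_000
DARKER_TONE_IMAGES = 4_000

current_share = DARKER_TONE_IMAGES / TOTAL_IMAGES
print(f"Current share of darker-skin-tone images: {current_share:.2%}")

# Hypothetical target shares, for illustration only.
for target in (0.05, 0.10, 0.50):
    n = synthetic_images_needed(TOTAL_IMAGES, DARKER_TONE_IMAGES, target)
    print(f"Target {target:.0%}: about {n:,} synthetic images needed")
```

Even a modest target share requires many times more synthetic images than there are real darker-skin-tone images to learn from, which is consistent with the difficulty described above. The same calculation could be repeated per skin cancer type once the data is split further.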
About the project group
We are all students at Fontys IT and worked on this project for one semester (September to January). We held stand-ups and stand-downs twice a week, during which new work was distributed and team members could indicate if they were struggling with a specific task. We found this worked best given the limited time we were allowed in the TQ building.