A major bottleneck in clinical AI is the limited availability of labeled data for model training. Most recently introduced machine learning models in healthcare rely on supervised learning, which requires large datasets labeled by medical experts.
However, manual labeling is time-consuming and costly, so large datasets with high-quality labels are rare. At the same time, medical diagnostics routinely generate large amounts of data, especially in radiology and pathology.
This gap between the total amount of available data and the amount of labeled data currently hinders the large-scale development of diagnostic AI with supervised learning.
Semi-supervised learning, a branch of AI that is less well known (at least in healthcare), can exploit this abundance of unlabeled data to develop robust models while needing only a fraction of the labeled data. It can therefore be a viable tool in the training and implementation of diagnostic AI tools.
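As a rough illustration of the idea (not a method taken from the review), the sketch below shows self-training, one common semi-supervised strategy: a model is fitted on a few labeled points, assigns pseudo-labels to the unlabeled points it is confident about, and is then refitted. The toy 1-D data, the nearest-centroid model, and the names `self_train`, `min_margin`, and `max_rounds` are all assumptions made for illustration.

```python
def centroids(points, labels):
    """Fit a 1-D nearest-centroid model: the mean feature value per class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(model, x):
    """Return (label, margin): the nearest centroid's class and how much
    closer it is than the runner-up (used here as a crude confidence)."""
    ranked = sorted(model, key=lambda y: abs(x - model[y]))
    margin = abs(x - model[ranked[1]]) - abs(x - model[ranked[0]])
    return ranked[0], margin

def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=1.0, max_rounds=10):
    """Hypothetical self-training loop: pseudo-label confident points, refit."""
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        model = centroids(xs, ys)
        confident, rest = [], []
        for x in pool:
            label, margin = predict(model, x)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:
            break  # nothing new to learn from; stop early
        for x, label in confident:
            xs.append(x)
            ys.append(label)
        pool = [x for x, _ in rest]  # ambiguous points stay unlabeled
    return centroids(xs, ys)

# Toy, made-up data: two expert-labeled points and eleven unlabeled ones.
labeled_x, labeled_y = [0.9, 3.1], [0, 1]
unlabeled_x = [-1.0, -0.5, 0.0, 0.5, 1.0, 2.1, 3.0, 3.5, 4.0, 4.5, 5.0]

model = self_train(labeled_x, labeled_y, unlabeled_x)
print(model)
```

In this toy setup the ambiguous point at 2.1 never clears the confidence margin and is left unlabeled, while the remaining points pull each class centroid toward the middle of its cluster. Real diagnostic data would need a far more capable model and careful validation of the pseudo-labels.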
If you want to find out more, check out our review on semi-supervised learning for cancer diagnostics.
Citation: Eckardt J-N, Bornhäuser M, Wendt K and Middeke JM (2022) Semi-supervised learning in cancer diagnostics. Front. Oncol. 12:960984. doi: 10.3389/fonc.2022.960984