Abstract—2D to 3D image registration plays a vital role in medical imaging and remains a significant challenge, largely because it requires the use and analysis of multimodal data. We address the problem by developing a multimodal machine learning algorithm that predicts the position of a 2D slice in a 3D biomedical atlas from textual annotations and image data. The algorithm first analyzes the images and the textual information separately using base models, and then combines the base models' outputs with a meta-learner. To evaluate the learning models, we built a custom accuracy function. We tested several Convolutional Neural Network architectures and transfer learning techniques to build an optimal base model for image analysis. To analyze the textual information, we used tree-based ensemble models, namely Random Forest and XGBoost, with hyperparameters tuned by grid search. We found that an XGBoost meta-learner performed best at combining the predictions of the base models. On the test set, the developed method achieved 99.55% accuracy in predicting the position of a 2D slice in the 3D atlas.
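The stacking architecture summarized above (modality-specific base models whose outputs feed a meta-learner) can be sketched as follows. This is a minimal illustration, not the paper's implementation: scikit-learn stand-ins are used throughout (Random Forests for both modalities and a GradientBoostingClassifier in place of XGBoost), and the feature matrices are synthetic placeholders standing in for CNN image features and text-annotation features.

```python
# Sketch of a two-modality stacking ensemble: two base models trained on
# different feature views of the same samples, with a meta-learner trained
# on their predicted class probabilities.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n)                   # toy slice-position labels
X_img = y[:, None] + rng.normal(0, 1.0, (n, 8))  # placeholder "image" features
X_txt = y[:, None] + rng.normal(0, 1.5, (n, 5))  # placeholder "text" features

idx_train, idx_test = train_test_split(np.arange(n), random_state=0)

# Base models, one per modality.
img_model = RandomForestClassifier(random_state=0).fit(X_img[idx_train], y[idx_train])
txt_model = RandomForestClassifier(random_state=0).fit(X_txt[idx_train], y[idx_train])

def meta_features(idx):
    """Meta-features: concatenated class probabilities from both base models."""
    return np.hstack([img_model.predict_proba(X_img[idx]),
                      txt_model.predict_proba(X_txt[idx])])

# Meta-learner combining the base models' outputs (XGBoost in the paper).
meta = GradientBoostingClassifier(random_state=0).fit(meta_features(idx_train), y[idx_train])
acc = meta.score(meta_features(idx_test), y[idx_test])
```

A production stacking setup would train the meta-learner on out-of-fold base-model predictions rather than on predictions over the base models' own training data, to avoid leakage.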
Index Terms—image registration, multimodal data, EMAP atlas, CNN, deep learning
Cite: B. Almogadwy, N. K. Taylor, and A. Burger, "Multimodal Machine Learning for 2D to 3D Mapping in Biomedical Atlases," Journal of Image and Graphics, Vol. 10, No. 2, pp. 64-69, June 2022.
Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.