JOIG 2026 Vol.14(2):208-229
doi: 10.18178/joig.14.2.208-229

Hierarchical Multi-scale Transformer for Breast Tumor Staging with Visual Interpretability and Risk Stratification

Satyanarayana Reddy Beram 1,*, R Lalchhanhima 1, and Ksh. Robert Singh 2
1. Department of Information Technology, Mizoram University, Mizoram, India
2. Department of Electrical Engineering, Mizoram University, Mizoram, India
Email: mzu22007898@mzu.edu.in (S.R.B.); chhana.mizo@gmail.com (R.L.); robert_kits@yahoo.co.in (K.R.S.)
*Corresponding author

Manuscript received September 18, 2025; revised November 7, 2025; accepted December 10, 2025; published March 26, 2026.

Abstract—Accurate tumor staging and risk stratification in breast cancer are critical for guiding treatment decisions. Traditional decision-making relies on the analysis of Whole Slide Images (WSIs), which is labor-intensive and subject to inter-observer variability. To address these challenges, we propose a modular and interpretable deep learning framework for automated tumor staging through multi-resolution histopathological analysis. Our Hierarchical Multi-Scale Transformer (HMS-T) integrates Vision Transformers (ViTs) operating at 5×, 10×, and 20× magnifications to capture both cellular and architectural features. A novel cross-scale attention fusion module combines these multi-scale representations, enabling robust prediction, from primary-tumor WSIs, of the American Joint Committee on Cancer (AJCC) stage group labels recorded in the clinical data. Trained on 1092 patients from The Cancer Genome Atlas–Breast Invasive Carcinoma (TCGA-BRCA) cohort, HMS-T achieves state-of-the-art performance with a staging accuracy of 91.5%, a macro F1-score of 0.89, and a quadratic-weighted kappa of 0.92, indicating strong agreement with pathological standards. Moreover, the model's attention maps exhibit high spatial interpretability, aligning closely with expert-annotated regions (Dice score = 0.81). Finally, we introduce a lightweight clinical extension for preliminary survival risk stratification that achieves a concordance index of 0.74, a step toward full prognostic modeling. By combining high performance with transparent decision-making, HMS-T represents a significant advance toward deployable Artificial Intelligence (AI)-assisted pathology tools for breast cancer.

Keywords—multi-scale histopathology, vision transformers, cross-scale attention fusion, American Joint Committee on Cancer (AJCC) tumor staging, survival risk stratification, Whole Slide Images (WSIs), deep learning

Cite: Satyanarayana Reddy Beram, R Lalchhanhima, and Ksh. Robert Singh, "Hierarchical Multi-scale Transformer for Breast Tumor Staging with Visual Interpretability and Risk Stratification," Journal of Image and Graphics, Vol. 14, No. 2, pp. 208-229, 2026.

Copyright © 2026 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC-BY-4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.
