JOIG 2026 Vol.14(3):444-453
doi: 10.18178/joig.14.3.444-453

Comparative Analysis of Loss Functions for Semantic Segmentation: An Empirical Study on Cityscapes Dataset

Windra Swastika
Faculty of Technology and Design, Universitas Ma Chung, Malang, Indonesia
Email: windra.swastika@machung.ac.id

Manuscript received September 11, 2025; revised October 10, 2025; accepted December 30, 2025; published June 12, 2026.

Abstract—Semantic segmentation remains a fundamental challenge in computer vision, where the choice and weighting of loss functions significantly impact model performance. This study presents a comprehensive comparative analysis of individual versus combined loss functions with systematic weight ablation for semantic segmentation using modified Attention U-Net and DeepLabV3+ architectures on the Cityscapes dataset. We systematically evaluate seven weight configurations across three loss components (Cross-Entropy, Dice, Focal) through rigorous ablation studies, and validate our findings across two architectures to ensure generalizability. Through extensive experimentation across 20 epochs with 2,975 training and 500 validation images, our results demonstrate that the Dice-dominant weighting configuration (0.5:1.0:0.5 for CE:Dice:Focal) achieves superior performance with 57.83% mean Intersection over Union (mIoU) on Attention U-Net and 58.35% mIoU on DeepLabV3+, representing a 7.78% improvement over the best individual loss function. Comprehensive ablation studies reveal that weight configuration critically affects performance, with Dice-dominant weighting consistently outperforming equal weighting (55.59% mIoU) and individual loss functions. Qualitative analysis demonstrates substantial improvements in boundary delineation and small object detection, with boundary IoU improving by 1.41% and challenging class performance (trucks, pedestrians) improving by 5–21%. Statistical analysis reveals that Cross-Entropy provides the most efficient training with a 75.4% loss reduction, while Dice loss exhibits convergence challenges, resulting in only a 34.5% reduction. Our findings conclusively demonstrate that optimized combined loss function weighting achieves better segmentation performance than both individual approaches and naive equal weighting strategies, with consistent improvements across different network architectures.
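
As an illustration of the weighted combination described in the abstract, the following is a minimal sketch in plain Python, not the author's implementation. It assumes the combined loss is a simple weighted sum of the three components, with the Dice-dominant weights 0.5:1.0:0.5 (CE:Dice:Focal) taken from the abstract. The function names, the focal exponent `gamma=2.0`, the Dice smoothing term, and the per-pixel list representation are all assumptions made for the example.

```python
import math

def cross_entropy(probs, targets, eps=1e-7):
    """Mean negative log-probability of the true class over all pixels."""
    total = sum(-math.log(max(p[t], eps)) for p, t in zip(probs, targets))
    return total / len(targets)

def focal(probs, targets, gamma=2.0, eps=1e-7):
    """Cross-entropy scaled per pixel by (1 - p_true) ** gamma (gamma is assumed)."""
    total = 0.0
    for p, t in zip(probs, targets):
        pt = max(p[t], eps)
        total += -((1.0 - pt) ** gamma) * math.log(pt)
    return total / len(targets)

def dice(probs, targets, num_classes, smooth=1.0):
    """One minus the soft Dice coefficient, averaged over classes."""
    score = 0.0
    for c in range(num_classes):
        # Predicted probability mass that falls on pixels whose label is c.
        inter = sum(p[c] for p, t in zip(probs, targets) if t == c)
        pred_sum = sum(p[c] for p in probs)
        true_sum = sum(1 for t in targets if t == c)
        score += (2.0 * inter + smooth) / (pred_sum + true_sum + smooth)
    return 1.0 - score / num_classes

def combined_loss(probs, targets, num_classes, weights=(0.5, 1.0, 0.5)):
    """Weighted sum of CE, Dice, and Focal; default weights follow the abstract."""
    w_ce, w_dice, w_focal = weights
    return (w_ce * cross_entropy(probs, targets)
            + w_dice * dice(probs, targets, num_classes)
            + w_focal * focal(probs, targets))

# Two pixels, two classes: each row holds the per-class probabilities of one pixel.
example_probs = [[0.6, 0.4], [0.3, 0.7]]
example_targets = [0, 1]
example_loss = combined_loss(example_probs, example_targets, num_classes=2)
```

In a training setup the same weighted sum would typically be computed on tensors in a deep learning framework. Passing `weights=(1.0, 1.0, 1.0)` gives an equal-weight combination of the three components, which is the kind of baseline the abstract compares against.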

Keywords—semantic segmentation, loss function weighting, ablation study, Attention U-Net, DeepLabV3+, Cityscapes dataset, computer vision, deep learning

Cite: Windra Swastika, "Comparative Analysis of Loss Functions for Semantic Segmentation: An Empirical Study on Cityscapes Dataset," Journal of Image and Graphics, Vol. 14, No. 3, pp. 444-453, 2026.

Copyright © 2026 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
