Manuscript received January 12, 2026; revised January 27, 2026; accepted March 6, 2026; published June 17, 2026.
Abstract—Despite significant advances in Arabic Handwritten Word Recognition (AHWR), the specific contribution of sequential encoders remains unclear because most prior studies change several architectural elements simultaneously, making comparisons difficult to interpret. This work presents the first carefully controlled evaluation of Bidirectional Long Short-Term Memory (BiLSTM) and Transformer encoders within an identical Convolutional Neural Network-Connectionist Temporal Classification (CNN-CTC) framework. All external factors, including preprocessing, the truncated ResNet-50 backbone, CTC alignment, Word Beam Search (WBS) decoding, and dataset splits, are held constant to isolate the effect of the sequence modeling mechanism itself, a dimension not explicitly analyzed in previous literature. The study also evaluates three augmentation strategies, including uniform augmentation and frequency-aware schemes at both the word and character levels, which address the distributional imbalance inherent in Arabic handwriting datasets. Experiments on IFN/ENIT show that the Transformer consistently achieves higher accuracy, up to 98.91% Character Accuracy (CAR) and 98.41% Word Accuracy (WAR), while the BiLSTM offers substantially faster inference. These findings provide the first reproducible quantification of the accuracy-efficiency trade-off between recurrent and attention-based encoders for Arabic cursive handwriting under fully controlled conditions.

Keywords—Arabic handwritten word recognition, ResNet-50, transformer, Bidirectional Long Short-Term Memory (BiLSTM), Connectionist Temporal Classification (CTC), Word Beam Search (WBS), data augmentation

Cite: Imane Bounour, Alae Ammour, Ghizlane Khaissidi, and Mostafa Mrabti, "A Controlled Comparison of BiLSTM and Transformer Encoders for Arabic Handwritten Word Recognition," Journal of Image and Graphics, Vol. 14, No. 3, pp. 506-520, 2026.

Copyright © 2026 by the authors.
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.