MMTFL: Multi-Timescale Multi-Modal Feature Learning for Weakly-Supervised Anomaly Detection

JOIG 2026 Vol.14(1):96-107
doi: 10.18178/joig.14.1.96-107

Erkut Akdag *, Henk Corporaal, Peter H. N. de With, and Egor Bondarev

Electrical Engineering Department, Eindhoven University of Technology, Eindhoven, The Netherlands
Email: e.akdag@tue.nl (E.A.); h.corporaal@tue.nl (H.C.); p.h.n.de.with@tue.nl (P.H.N.D.W.); e.bondarev@tue.nl (E.B.)
*Corresponding author

Manuscript received May 19, 2025; revised July 18, 2025; accepted September 1, 2025; published February 27, 2026.

Abstract—Detection of anomalous events is critical for public safety and requires capturing fine-grained motion patterns and contextual information across multiple timescales. To this end, we propose a Multi-Timescale Feature Learning (MTFL) method to enhance the representation of anomaly features. Short, medium, and long temporal tubelets are employed to extract spatio-temporal video features using a Video Swin Transformer. Experimental results demonstrate that MTFL achieves an anomaly detection performance of 87.16% Area Under the Curve (AUC) on the University of Central Florida (UCF)-Crime dataset and 84.57% Average Precision (AP) on the Xidian University (XD)-Violence dataset. Because MTFL relies solely on spatio-temporal features extracted from a single modality (RGB video), it encounters challenges such as occlusions, ambiguous actions, and limited contextual understanding. To overcome these limitations, we also propose Multi-Modal Multi-Timescale Feature Learning (MMTFL), which integrates spatio-temporal, depth, and text-based features in conjunction with multi-timescale tubelet analysis, rather than focusing only on RGB inputs. Although adding modalities increases the feature extraction cost, the approach remains feasible for real-world use. Experimental results demonstrate that MMTFL outperforms single-modality approaches, achieving 88.29% AUC on the UCF-Crime dataset and 84.96% AP on the XD-Violence dataset. By leveraging complementary information from multiple modalities, the proposed approach achieves more robust and accurate detection of complex and diverse anomalies than single-modal methods.

Keywords—anomaly detection, surveillance videos, video understanding, multi-modality, feature fusion, attention
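
As a rough illustration of the multi-timescale tubelet idea described in the abstract, the following minimal sketch splits a video of a given frame count into short, medium, and long non-overlapping tubelets. The tubelet lengths (8, 32, and 64 frames), the function names, and the non-overlapping split are assumptions made for illustration only; the abstract does not state the actual lengths or sampling scheme, and the feature extraction step with the Video Swin Transformer is not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical tubelet lengths in frames (assumption: the abstract gives no values).
TIMESCALES: Dict[str, int] = {"short": 8, "medium": 32, "long": 64}


def split_into_tubelets(num_frames: int, length: int) -> List[Tuple[int, int]]:
    """Return non-overlapping (start, end) frame ranges of the given length.

    The end index is exclusive. A trailing remainder shorter than
    `length` is dropped.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return [(s, s + length) for s in range(0, num_frames - length + 1, length)]


def multi_timescale_tubelets(num_frames: int) -> Dict[str, List[Tuple[int, int]]]:
    """Split the same video into tubelets at every configured timescale."""
    return {
        name: split_into_tubelets(num_frames, length)
        for name, length in TIMESCALES.items()
    }


if __name__ == "__main__":
    # Show how many tubelets each timescale yields for a 128-frame clip.
    for name, tubelets in multi_timescale_tubelets(128).items():
        print(name, len(tubelets), tubelets[:2])
```

In a pipeline of this kind, each frame range would be passed to a video backbone to obtain one feature vector per tubelet, so that the same event is represented at several temporal resolutions.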

Cite: Erkut Akdag, Henk Corporaal, Peter H. N. de With, and Egor Bondarev, "MMTFL: Multi-Timescale Multi-Modal Feature Learning for Weakly-Supervised Anomaly Detection," Journal of Image and Graphics, Vol. 14, No. 1, pp. 96-107, 2026.

Copyright © 2026 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC-BY-4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.
