Suggestions of a Deep Learning Based Automatic Text Annotation System for Agricultural Sites Using GoogLeNet Inception and MS-COCO

Shinji Kawakura 1 and Ryosuke Shibasaki 2
1. Department of Technology Management for Innovation, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
2. Center for Spatial Information Science, The University of Tokyo, Kashiwa-shi, Chiba, Japan

Abstract—Image recognition methodologies targeting agricultural (agri-) workers, managers, technicians, and researchers, as well as inanimate targets (e.g., harvests, agri-tools), have attracted significant interest. Currently, the most common approaches use various real-time visual analyses and recorded-data-based analyses at outdoor and indoor agri-sites. Recent Artificial Intelligence (AI)-based studies have also proposed diverse automatic camera-based awareness systems with text annotation. Although some of these systems include monitoring and identification tools for these agri-fields, their captioning abilities and accuracy levels have been insufficient for practical use. Further improvements have since increased accuracy by incorporating recent deep learning methodologies, particularly open services provided by large IT companies such as Google and Microsoft. Deep learning based analysis systems sometimes detect and highlight hidden, subtle points that a human may fail to notice. We therefore develop deep learning based auto-annotating systems for small- to medium-sized Japanese indoor and outdoor agri-working sites and their workers. We use visual data sets covering a variety of real, common Japanese-style agri-tools. We statistically analyze the obtained data and compare the results with comments obtained from experienced agri-workers. Our results confirm the systems’ utility and validity and indicate their limitations.

Index Terms—automatic annotation from pictures, neural image captioning, deep learning, TensorFlow, CNN, GoogLeNet Inception, MS-COCO

Cite: Shinji Kawakura and Ryosuke Shibasaki, "Suggestions of a Deep Learning Based Automatic Text Annotation System for Agricultural Sites Using GoogLeNet Inception and MS-COCO," Journal of Image and Graphics, Vol. 8, No. 4, pp. 120-125, December 2020. doi: 10.18178/joig.8.4.120-125

Copyright © 2020 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.