TY - GEN
T1 - Detecting arbitrary oriented text in the wild with a visual attention model
AU - Huang, Wenyi
AU - He, Dafang
AU - Yang, Xiao
AU - Zhou, Zihan
AU - Kifer, Daniel
AU - Giles, C. Lee
N1 - Funding Information:
This work was funded by NSF grant CCF-1317560 and a GPU donation by NVIDIA.
Publisher Copyright:
© 2016 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2016/10/1
Y1 - 2016/10/1
N2 - Text embedded in images provides important semantic information about a scene and its content. Detecting text in an unconstrained environment is a challenging task because of the many fonts, sizes, backgrounds, and alignments of the characters. We present a novel attention model for detecting arbitrarily oriented and curved scene text. Inspired by the attention mechanisms in the human visual system, our model uses a spatial glimpse network to process the attended area and deploys a recurrent neural network that aggregates the information over time to determine the attention movement. Combined with an off-the-shelf region proposal method, the model achieves state-of-the-art performance on the highly cited ICDAR2013 dataset and on the MSRA-TD500 dataset, which contains arbitrarily oriented text.
AB - Text embedded in images provides important semantic information about a scene and its content. Detecting text in an unconstrained environment is a challenging task because of the many fonts, sizes, backgrounds, and alignments of the characters. We present a novel attention model for detecting arbitrarily oriented and curved scene text. Inspired by the attention mechanisms in the human visual system, our model uses a spatial glimpse network to process the attended area and deploys a recurrent neural network that aggregates the information over time to determine the attention movement. Combined with an off-the-shelf region proposal method, the model achieves state-of-the-art performance on the highly cited ICDAR2013 dataset and on the MSRA-TD500 dataset, which contains arbitrarily oriented text.
UR - http://www.scopus.com/inward/record.url?scp=84994626543&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994626543&partnerID=8YFLogxK
U2 - 10.1145/2964284.2967282
DO - 10.1145/2964284.2967282
M3 - Conference contribution
AN - SCOPUS:84994626543
T3 - MM 2016 - Proceedings of the 2016 ACM Multimedia Conference
SP - 551
EP - 555
BT - MM 2016 - Proceedings of the 2016 ACM Multimedia Conference
PB - Association for Computing Machinery, Inc
T2 - 24th ACM Multimedia Conference, MM 2016
Y2 - 15 October 2016 through 19 October 2016
ER -