Learning to read irregular text with attention mechanisms

Xiao Yang, Dafang He, Zihan Zhou, Daniel Kifer, Clyde Lee Giles

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

17 Citations (Scopus)

Abstract

We present a robust end-to-end neural-based model to attentively recognize text in natural images. In particular, we focus on accurately identifying irregular (perspectively distorted or curved) text, which has not been well addressed in the previous literature. Previous research on text reading often works with regular (horizontal and frontal) text and does not adequately generalize to processing text with perspective distortion or curving effects. Our work proposes to overcome this difficulty by introducing two learning components: (1) an auxiliary dense character detection task that helps to learn text-specific visual patterns, and (2) an alignment loss that provides guidance to the training of an attention model. We show with experiments that these two components are crucial for achieving fast convergence and high classification accuracy for irregular text recognition. Our model outperforms previous work on two irregular-text datasets, SVT-Perspective and CUTE80, and is also highly competitive on several regular-text datasets containing primarily horizontal and frontal text.

Original language: English (US)
Title of host publication: 26th International Joint Conference on Artificial Intelligence, IJCAI 2017
Editors: Carles Sierra
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 3280-3286
Number of pages: 7
ISBN (Electronic): 9780999241103
State: Published - Jan 1 2017
Event: 26th International Joint Conference on Artificial Intelligence, IJCAI 2017 - Melbourne, Australia
Duration: Aug 19 2017 to Aug 25 2017

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
ISSN (Print): 1045-0823

Other

Other: 26th International Joint Conference on Artificial Intelligence, IJCAI 2017
Country: Australia
City: Melbourne
Period: 8/19/17 to 8/25/17

Fingerprint

Text processing
Experiments

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence

Cite this

Yang, X., He, D., Zhou, Z., Kifer, D., & Giles, C. L. (2017). Learning to read irregular text with attention mechanisms. In C. Sierra (Ed.), 26th International Joint Conference on Artificial Intelligence, IJCAI 2017 (pp. 3280-3286). (IJCAI International Joint Conference on Artificial Intelligence). International Joint Conferences on Artificial Intelligence.
Yang, Xiao ; He, Dafang ; Zhou, Zihan ; Kifer, Daniel ; Giles, Clyde Lee. / Learning to read irregular text with attention mechanisms. 26th International Joint Conference on Artificial Intelligence, IJCAI 2017. editor / Carles Sierra. International Joint Conferences on Artificial Intelligence, 2017. pp. 3280-3286 (IJCAI International Joint Conference on Artificial Intelligence).
@inproceedings{2235672003904757b244ad416e547320,
title = "Learning to read irregular text with attention mechanisms",
abstract = "We present a robust end-to-end neural-based model to attentively recognize text in natural images. Particularly, we focus on accurately identifying irregular (perspectively distorted or curved) text, which has not been well addressed in the previous literature. Previous research on text reading often works with regular (horizontal and frontal) text and does not adequately generalize to processing text with perspective distortion or curving effects. Our work proposes to overcome this difficulty by introducing two learning components: (1) an auxiliary dense character detection task that helps to learn text specific visual patterns, (2) an alignment loss that provides guidance to the training of an attention model. We show with experiments that these two components are crucial for achieving fast convergence and high classification accuracy for irregular text recognition. Our model outperforms previous work on two irregular-text datasets: SVT-Perspective and CUTE80, and is also highly-competitive on several regular-text datasets containing primarily horizontal and frontal text.",
author = "Xiao Yang and Dafang He and Zihan Zhou and Daniel Kifer and Giles, {Clyde Lee}",
year = "2017",
month = "1",
day = "1",
language = "English (US)",
series = "IJCAI International Joint Conference on Artificial Intelligence",
publisher = "International Joint Conferences on Artificial Intelligence",
pages = "3280--3286",
editor = "Carles Sierra",
booktitle = "26th International Joint Conference on Artificial Intelligence, IJCAI 2017",

}

Yang, X, He, D, Zhou, Z, Kifer, D & Giles, CL 2017, Learning to read irregular text with attention mechanisms. in C Sierra (ed.), 26th International Joint Conference on Artificial Intelligence, IJCAI 2017. IJCAI International Joint Conference on Artificial Intelligence, International Joint Conferences on Artificial Intelligence, pp. 3280-3286, 26th International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, 8/19/17.

Learning to read irregular text with attention mechanisms. / Yang, Xiao; He, Dafang; Zhou, Zihan; Kifer, Daniel; Giles, Clyde Lee.

26th International Joint Conference on Artificial Intelligence, IJCAI 2017. ed. / Carles Sierra. International Joint Conferences on Artificial Intelligence, 2017. p. 3280-3286 (IJCAI International Joint Conference on Artificial Intelligence).

TY - GEN

T1 - Learning to read irregular text with attention mechanisms

AU - Yang, Xiao

AU - He, Dafang

AU - Zhou, Zihan

AU - Kifer, Daniel

AU - Giles, Clyde Lee

PY - 2017/1/1

Y1 - 2017/1/1

AB - We present a robust end-to-end neural-based model to attentively recognize text in natural images. Particularly, we focus on accurately identifying irregular (perspectively distorted or curved) text, which has not been well addressed in the previous literature. Previous research on text reading often works with regular (horizontal and frontal) text and does not adequately generalize to processing text with perspective distortion or curving effects. Our work proposes to overcome this difficulty by introducing two learning components: (1) an auxiliary dense character detection task that helps to learn text specific visual patterns, (2) an alignment loss that provides guidance to the training of an attention model. We show with experiments that these two components are crucial for achieving fast convergence and high classification accuracy for irregular text recognition. Our model outperforms previous work on two irregular-text datasets: SVT-Perspective and CUTE80, and is also highly-competitive on several regular-text datasets containing primarily horizontal and frontal text.

UR - http://www.scopus.com/inward/record.url?scp=85031934691&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85031934691&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85031934691

T3 - IJCAI International Joint Conference on Artificial Intelligence

SP - 3280

EP - 3286

BT - 26th International Joint Conference on Artificial Intelligence, IJCAI 2017

A2 - Sierra, Carles

PB - International Joint Conferences on Artificial Intelligence

ER -

Yang X, He D, Zhou Z, Kifer D, Giles CL. Learning to read irregular text with attention mechanisms. In Sierra C, editor, 26th International Joint Conference on Artificial Intelligence, IJCAI 2017. International Joint Conferences on Artificial Intelligence. 2017. p. 3280-3286. (IJCAI International Joint Conference on Artificial Intelligence).