Document Domain Randomization for Deep Learning Document Layout Extraction

Meng Ling, Jian Chen, Torsten Möller, Petra Isenberg, Tobias Isenberg, Michael Sedlmair, Robert S. Laramee, Han Wei Shen, Jian Wu, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present document domain randomization (DDR), the first successful transfer of CNNs trained only on graphically rendered pseudo-paper pages to real-world document segmentation. DDR renders pseudo-document pages by modeling randomized textual and non-textual contents of interest, with user-defined layout and font styles to support joint learning of fine-grained classes. We demonstrate competitive results using our DDR approach to extract nine document classes from the benchmark CS-150 and papers published in two domains, namely annual meetings of the Association for Computational Linguistics (ACL) and IEEE Visualization (VIS). We compare DDR to conditions of style mismatch and of fewer or noisier samples that are more easily obtained in the real world. We show that high-fidelity semantic information is not necessary to label semantic classes, but style mismatch between train and test can lower model accuracy. Using smaller training samples had a slightly detrimental effect. Finally, network models still achieved high test accuracy when correct labels were diluted towards confusing labels; this behavior holds across several classes.

Original language: English (US)
Title of host publication: Document Analysis and Recognition - ICDAR 2021 - 16th International Conference, Proceedings
Editors: Josep Lladós, Daniel Lopresti, Seiichi Uchida
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 497-513
Number of pages: 17
ISBN (Print): 9783030865481
DOIs
State: Published - 2021
Event: 16th International Conference on Document Analysis and Recognition, ICDAR 2021 - Lausanne, Switzerland
Duration: Sep 5 2021 → Sep 10 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12821 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Conference on Document Analysis and Recognition, ICDAR 2021
Country/Territory: Switzerland
City: Lausanne
Period: 9/5/21 → 9/10/21

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)
