Adaptive removal of background and white space from document images using seam categorization

Claude Fillion, Zhigang Fan, Vishal Monga

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


    Document images are obtained regularly by rasterization of document content and as scans of printed documents. Resizing via background and white space removal is often desired for better consumption of these images, whether on displays or in print. While white space and background are easy to identify in images, existing methods such as naïve removal and content-aware resizing (seam carving) each have limitations that can lead to undesirable artifacts, such as uneven spacing between lines of text or poor arrangement of content. An adaptive method based on image content is hence needed. In this paper we propose an adaptive method to intelligently remove white space and background content from document images. Document images differ from pictorial images in structure. They typically contain objects (text letters, pictures and graphics) separated by uniform background, which includes both white paper space and other uniformly colored background. Pixels in uniform background regions are excellent candidates for deletion if resizing is required, as deleting them introduces less change in document content and style than deleting object pixels. We propose a background deletion method that exploits both local and global context. The method aims to retain the document's structural information and image quality.
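To make the idea of deleting uniform background pixels concrete, the following is a minimal sketch of the naïve baseline the abstract mentions, plus a simple variant that caps each run of background rows. It is not the seam categorization method proposed in the paper. The names `is_uniform_row`, `remove_background_rows`, `max_gap`, `tolerance`, and the tiny `page` image are all hypothetical and chosen only for illustration.

```python
from typing import List

# Assumption: a grayscale image stored as a list of rows, 0 = black, 255 = white.
Image = List[List[int]]


def is_uniform_row(row: List[int], tolerance: int = 0) -> bool:
    """True if every pixel is within `tolerance` of the row's first pixel."""
    return all(abs(p - row[0]) <= tolerance for p in row)


def remove_background_rows(image: Image, max_gap: int = 0,
                           tolerance: int = 0) -> Image:
    """Drop uniform rows, keeping at most `max_gap` rows of each uniform run.

    max_gap=0 corresponds to naive removal: every uniform row is deleted,
    so all spacing between content rows collapses.
    max_gap>0 keeps a bounded amount of spacing in each run instead.
    """
    result: Image = []
    run_length = 0  # number of consecutive uniform rows seen so far
    for row in image:
        if is_uniform_row(row, tolerance):
            run_length += 1
            if run_length <= max_gap:
                result.append(row)
        else:
            run_length = 0
            result.append(row)
    return result


# Hypothetical 7x4 "page": two content rows separated by white space.
page: Image = [
    [255, 255, 255, 255],
    [255, 0, 0, 255],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
    [255, 0, 255, 255],
    [255, 255, 255, 255],
]

naive = remove_background_rows(page)              # only the 2 content rows remain
capped = remove_background_rows(page, max_gap=1)  # one spacing row kept per run
```

With `max_gap=0` the two content rows end up directly adjacent, which is the kind of spacing artifact the abstract attributes to naïve removal. The capped variant only bounds each gap locally; it uses none of the global context that the proposed method is said to exploit.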

    Original language: English (US)
    Title of host publication: Imaging and Printing in a Web 2.0 World II
    State: Published - Mar 29 2011
    Event: Imaging and Printing in a Web 2.0 World II - San Francisco, CA, United States
    Duration: Jan 26 2011 - Jan 27 2011

    Publication series

    Name: Proceedings of SPIE - The International Society for Optical Engineering
    ISSN (Print): 0277-786X


    Other: Imaging and Printing in a Web 2.0 World II
    Country/Territory: United States
    City: San Francisco, CA

    All Science Journal Classification (ASJC) codes

    • Electronic, Optical and Magnetic Materials
    • Condensed Matter Physics
    • Computer Science Applications
    • Applied Mathematics
    • Electrical and Electronic Engineering

