In this paper, an algorithm is developed for segmenting document images into four classes: background, photograph, text, and graph. Features used for classification are based on the distribution patterns of wavelet coefficients in high-frequency bands. The algorithm has two important attributes. First, it is multiscale: it classifies an image at different resolutions adaptively, which enables accurate classification at class boundaries as well as fast classification overall. Second, it uses accumulated context information to improve classification accuracy.
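To make the idea of features drawn from high-frequency wavelet coefficients concrete, the following is a minimal sketch and not the paper's actual feature definitions. It assumes a one-level 2-D Haar transform on a small grayscale block and computes three illustrative statistics of the LH, HL, and HH coefficients. The function names `haar_level` and `high_band_features`, the `zero_tol` parameter, and the statistics themselves are hypothetical choices made for illustration only.

```python
def haar_level(block):
    """One level of an (averaging) 2-D Haar transform.

    `block` is a list of rows of gray values with even height and width.
    Returns (LL, LH, HL, HH), each a list of rows half the input size.
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(block), 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, len(block[0]), 2):
            a, b = block[r][c], block[r][c + 1]          # top pair of the 2x2 cell
            d, e = block[r + 1][c], block[r + 1][c + 1]  # bottom pair
            ll_row.append((a + b + d + e) / 4.0)  # local average
            lh_row.append((a + b - d - e) / 4.0)  # change between rows
            hl_row.append((a - b + d - e) / 4.0)  # change between columns
            hh_row.append((a - b - d + e) / 4.0)  # diagonal change
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def high_band_features(block, zero_tol=1.0):
    """Illustrative statistics of the high-frequency coefficients (assumed, not from the paper)."""
    _, lh, hl, hh = haar_level(block)
    coeffs = [v for band in (lh, hl, hh) for row in band for v in row]
    n = len(coeffs)
    return {
        # Share of coefficients close to zero: high for flat background.
        "near_zero_fraction": sum(1 for v in coeffs if abs(v) <= zero_tol) / n,
        # Average magnitude: a rough measure of edge energy.
        "mean_abs": sum(abs(v) for v in coeffs) / n,
        # Number of distinct rounded values: small when coefficients
        # concentrate on a few discrete levels.
        "distinct_values": len({round(v) for v in coeffs}),
    }


flat = [[128] * 4 for _ in range(4)]            # uniform, background-like block
stripes = [[0, 255, 0, 255] for _ in range(4)]  # sharp two-level pattern

flat_features = high_band_features(flat)
stripe_features = high_band_features(stripes)
```

In this toy setting the uniform block yields only zero coefficients, while the striped block yields coefficients concentrated on two values. A real classifier would feed statistics of this kind, computed per block and per resolution, into a decision rule. How the paper defines its features and decision rule is not stated in the abstract.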
All Science Journal Classification (ASJC) codes
- Computer Graphics and Computer-Aided Design