Abstract
Physical damage (such as tears and scratches) is commonly seen in historical documents. Recognition of such damage is currently absent in digitization-and-information-extraction (DIE) systems but is crucial for automatic document comprehension and exploitation. In this paper, we propose a generic damage recognition (DR) method based on joint global and local modeling of the text homogeneity (TH) pattern exhibited in document images. More specifically, a connected component (CC) based formulation is developed as a global homogeneity measure, where TH is characterized using a probabilistic graph model for a coarse recognition of damaged regions. A multi-resolution analysis (MRA) of TH is further developed for a granular within-CC recognition of damage pixels, where the disparity between damage and text pixels is characterized by exploiting neighborhood transitions. This enables the formulation of a local homogeneity measure, where the neighborhood transition around an individual pixel is modeled using the propagation of the approximation coefficients of a stationary wavelet transform (SWT). The proposed global and local homogeneity measures are integrated as a joint likelihood in a Bayesian model with a Markov random field (MRF) prior, where DR is formulated as a maximum a posteriori (MAP) inference problem that is solved using Markov Chain Monte Carlo (MCMC) sampling. The resulting algorithm is tested on a set of real-life historical newspaper images containing damage of varying size and shape. The performance of the algorithm is evaluated using both F-measures and the Intersection-over-Union (IoU) metric, and the test results indicate that the proposed method is promising.
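For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how an F-measure and IoU could be computed for a predicted damage mask against a reference mask. It is not taken from the paper: the function name `damage_mask_scores`, the choice of the balanced F1 variant, the nested-list mask representation, and the convention of returning 0.0 for undefined ratios are all assumptions made for illustration.

```python
def damage_mask_scores(predicted, reference):
    """Return (precision, recall, f1, iou) for two binary masks of equal shape.

    Masks are nested lists of 0/1 values, where 1 marks a damage pixel.
    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    tp = fp = fn = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                tp += 1          # damage pixel correctly detected
            elif p and not r:
                fp += 1          # clean pixel wrongly flagged as damage
            elif r and not p:
                fn += 1          # damage pixel that was missed

    # Assumed convention: undefined ratios (zero denominator) are reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    union = tp + fp + fn         # pixels marked as damage in either mask
    iou = tp / union if union else 0.0
    return precision, recall, f1, iou


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    reference = [[1, 0, 0],
                 [0, 1, 1]]
    print(damage_mask_scores(predicted, reference))
```

Both scores are built from the same pixel counts, but IoU penalizes each false positive and false negative more heavily than F1 does, which is one reason the two are often reported together.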
Original language | English |
---|---|
Article number | 108034 |
Journal | Pattern Recognition |
Volume | 118 |
DOIs | |
Publication status | Published - Oct 2021 |
Fingerprint
Dive into the research topics of 'Bayesian Damage Recognition in Document Images Based on a Joint Global and Local Homogeneity Model'. Together they form a unique fingerprint.
Projects
- 1 Active
VLAAI1: Subsidie: Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen
1/07/19 → 31/12/24
Project: Applied