TY - JOUR
T1 - Document Image Understanding
T2 - Computational Image Processing in the Cultural Heritage Sector
AU - Lu, Tan
AU - Dooms, Ann
PY - 2022/8/17
Y1 - 2022/8/17
N2 - Textual documents, such as manuscripts and historical newspapers, make up an important part of our cultural heritage. Massive digitization projects have been conducted across the globe to better preserve such often vulnerable documents and to provide easier access to them. These digital counterparts also make it possible to unlock the rich information contained inside and across them, thanks to various types of computational models for document image understanding. In this paper we shed light on the document image processing pipeline, from scan to information extraction. As it turns out, algorithms driven by human perception are among the most powerful approaches for generic document image understanding, which must deal with a myriad of layouts. In this context, we explain in particular Gestalt visioning and the linked concept of text homogeneity, which allows for enhanced layout analysis and even damage recognition, especially relevant in a cultural heritage setting. We conclude with a recent promising development, namely joint visual and language processing, that will take document image understanding to the next level.
AB - Textual documents, such as manuscripts and historical newspapers, make up an important part of our cultural heritage. Massive digitization projects have been conducted across the globe to better preserve such often vulnerable documents and to provide easier access to them. These digital counterparts also make it possible to unlock the rich information contained inside and across them, thanks to various types of computational models for document image understanding. In this paper we shed light on the document image processing pipeline, from scan to information extraction. As it turns out, algorithms driven by human perception are among the most powerful approaches for generic document image understanding, which must deal with a myriad of layouts. In this context, we explain in particular Gestalt visioning and the linked concept of text homogeneity, which allows for enhanced layout analysis and even damage recognition, especially relevant in a cultural heritage setting. We conclude with a recent promising development, namely joint visual and language processing, that will take document image understanding to the next level.
U2 - 10.1109/MBITS.2022.3199678
DO - 10.1109/MBITS.2022.3199678
M3 - Article
SP - 1
EP - 13
JO - IEEE BITS the Information Theory Magazine
JF - IEEE BITS the Information Theory Magazine
SN - 2692-4110
ER -