Semantic-aware document image processing has been neglected over the past decade in the development of human-like AI systems for advanced document mining tasks, such as information extraction from complex layouts. Humans, however, rely on perceptual-cognitive skills to comprehend document content regardless of how varied the layout is. Semantic-aware document processing can therefore be developed by simulating human perceptual document recognition through joint visual and language processing. Visual data, such as document-level information (e.g. relative position in the semantic space), are exposed to the language representation using a 4-connectivity label pattern (left, right, up, down). A transformer-based architecture is designed to learn this language representation, and its inference performance shows promising results on tables with complex layouts.
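To make the 4-connectivity idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the table has already been segmented into a rectangular-ish grid of cell texts, and it invents an illustrative serialization with bracketed direction tags; the names `four_connectivity`, `serialize`, and the `[LEFT]`/`[NONE]` tags are hypothetical and do not come from the abstract.

```python
from typing import Dict, List, Optional, Tuple

# Offsets for the four neighbours named in the abstract, as (row, col) deltas.
DIRECTIONS: Dict[str, Tuple[int, int]] = {
    "left": (0, -1),
    "right": (0, 1),
    "up": (-1, 0),
    "down": (1, 0),
}

Position = Tuple[int, int]
Neighbours = Dict[str, Optional[str]]


def four_connectivity(grid: List[List[str]]) -> Dict[Position, Neighbours]:
    """Map each cell position to the text of its left/right/up/down neighbours.

    Assumption: `grid` is a list of rows of cell texts produced by some
    upstream layout analysis step that is not shown here.
    """
    result: Dict[Position, Neighbours] = {}
    for r, row in enumerate(grid):
        for c in range(len(row)):
            entry: Neighbours = {}
            for name, (dr, dc) in DIRECTIONS.items():
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[nr])
                # Cells on the border have no neighbour in that direction.
                entry[name] = grid[nr][nc] if inside else None
            result[(r, c)] = entry
    return result


def serialize(cell_text: str, entry: Neighbours) -> str:
    """Flatten a cell and its neighbours into one text sequence.

    The tag format is a hypothetical choice made for illustration only.
    """
    parts = [cell_text]
    for name in DIRECTIONS:
        neighbour = entry[name]
        parts.append(f"[{name.upper()}] {neighbour if neighbour is not None else '[NONE]'}")
    return " ".join(parts)


grid = [["Item", "Price"], ["Tea", "3.50"]]
labels = four_connectivity(grid)
sequence = serialize(grid[1][1], labels[(1, 1)])
```

Here the cell "3.50" is paired with its left neighbour "Tea" and its upper neighbour "Price", which is the kind of relative-position context a language model could then consume as plain text. How the actual model encodes these labels is not stated in the abstract.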
Original language: English
Publication status: Published - 21 Jun 2022
Event: AI Flanders Research Days - Irish College, Leuven, Leuven, Belgium
Duration: 21 Jun 2022 → …


Workshop: AI Flanders Research Days
Period: 21/06/22 → …


  • Document Mining
  • Semantics-aware
  • Graph
  • NLP
  • Computer Vision
  • Information extraction


Dive into the research topics of 'A Perception-Inspired Intelligent Document Parsing and Comprehension Model'. Together they form a unique fingerprint.
