Description
Deep neural networks achieve high performance in many signal processing tasks. However, they are often considered black-box models. To address this issue, there is growing interest in developing interpretable models, along with methods to explain model decisions. These techniques have been successful in text and image processing tasks, but their application to video remains limited. In this project, we propose a joint framework for interpretable and explainable deep learning for video. First, we exploit the low-complexity data structure inherent to video, in both the image and temporal domains, to learn video representations. Our approach builds on the deep unfolding framework, which provides interpretability by design. Considered applications include inverse problems, foreground separation and frame prediction. Second, we propose a novel recurrent model based on state-space model assumptions and deep Kalman filtering to learn interpretable video dynamics and semantics. Applications include inverse problems and object tracking. Third, we propose trustworthy post-hoc analysis methods to explain model decisions. Input saliency techniques will be used to generate disentangled visual and temporal explanations. We leverage the structures and dynamics captured by our networks to bridge interpretability and explainability by relating the output and input spaces to the interpretable latent space. For the proposed applications, we expect to reach state-of-the-art performance.
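The description itself contains no code; as a minimal, hypothetical sketch of the deep-unfolding idea it refers to (turning the iterations of a classical optimisation algorithm into learnable network layers), the PyTorch snippet below unfolds the ISTA algorithm for sparse coding. All names and hyper-parameters (`UnfoldedISTALayer`, `W_e`, `W_g`, `theta`, `num_layers`) are illustrative assumptions and are not taken from the project.

```python
import torch
import torch.nn as nn

class UnfoldedISTALayer(nn.Module):
    """One unfolded ISTA iteration: x <- soft_threshold(x - W_g(x) + W_e(y), theta)."""
    def __init__(self, code_dim, meas_dim):
        super().__init__()
        self.W_e = nn.Linear(meas_dim, code_dim, bias=False)  # maps measurements y to code space
        self.W_g = nn.Linear(code_dim, code_dim, bias=False)  # gradient-like update on the code
        self.theta = nn.Parameter(torch.full((code_dim,), 0.1))  # learned soft threshold

    def forward(self, x, y):
        z = x - self.W_g(x) + self.W_e(y)
        # soft-thresholding: proximal operator of the l1 norm
        return torch.sign(z) * torch.relu(z.abs() - self.theta)

class UnfoldedISTA(nn.Module):
    """Stack of unfolded iterations; network depth plays the role of iteration count."""
    def __init__(self, code_dim, meas_dim, num_layers=5):
        super().__init__()
        self.code_dim = code_dim
        self.layers = nn.ModuleList(
            UnfoldedISTALayer(code_dim, meas_dim) for _ in range(num_layers)
        )

    def forward(self, y):
        x = torch.zeros(y.shape[0], self.code_dim, device=y.device)
        for layer in self.layers:
            x = layer(x, y)
        return x

# Usage: recover a 256-dimensional sparse code from 64-dimensional measurements.
model = UnfoldedISTA(code_dim=256, meas_dim=64, num_layers=10)
x_hat = model(torch.randn(8, 64))
```

Because each layer mirrors one iteration of the underlying algorithm, the learned weights and thresholds keep the meaning of the corresponding quantities in that algorithm, which is the sense in which deep unfolding is described as interpretable by design.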
| Period | 1 Jan 2020 → 1 Aug 2020 |
| --- | --- |
| Held at | Research Foundation - Flanders, Belgium |
| Degree of Recognition | National |
Related content

Projects
- Interpretable and Explainable Deep Learning for Video Processing (Project: Fundamental)