Life is Random, Time is Not: Markov Decision Processes with Window Objectives

Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour

Research output: Contribution to journalArticlepeer-review

6 Citations (Scopus)
88 Downloads (Pure)

Abstract

The window mechanism was introduced by Chatterjee et al. to strengthen classical game objectives with time bounds. It permits to synthesize system controllers that exhibit acceptable behaviors within a configurable time frame, all along their infinite execution, in contrast to the traditional objectives that only require correctness of behaviors in the limit. The window concept has proved its interest in a variety of two-player zero-sum games because it enables reasoning about such time bounds in system specifications, but also thanks to the increased tractability that it usually yields. In this work, we extend the window framework to stochastic environments by considering Markov decision processes. A fundamental problem in this context is the threshold probability problem: given an objective it aims to synthesize strategies that guarantee satisfying runs with a given probability. We solve it for the usual variants of window objectives, where either the time frame is set as a parameter, or we ask if such a time frame exists. We develop a generic approach for window-based objectives and instantiate it for the classical mean-payoff and parity objectives, already considered in games. Our work paves the way to a wide use of the window mechanism in stochastic models.

Original languageEnglish
Pages (from-to)13:1-13:30
Number of pages30
JournalLogical Methods in Computer Science
Volume16
Issue number4
DOIs
Publication statusPublished - 14 Dec 2020

Bibliographical note

Funding Information:
Research supported by F.R.S.-FNRS under Grant n◦ F.4520.18 (ManySynth), and F.R.S.-FNRS mobility funding for scientific missions (Y. Oualhadj in UMONS, 2018). Florent Delgrange is now affiliated with Vrije Universiteit Brussel, Belgium. Mickael Randour is an F.R.S.-FNRS Research Associate. Tool support. Thanks to its low complexity and its adequacy w.r.t. applications, our window framework lends itself well to tool development. We are currently building a tool suite for MDPs with window objectives based on the main results of this paper along with the aforementioned extensions. Our aim is to provide a dedicated extension of Storm, a cutting-edge probabilistic model checker [DJKV17].

Funding Information:
Key words and phrases: Markov decision processes, window mean-payoff, window parity. ∗ Research supported by F.R.S.-FNRS under Grant n◦ F.4520.18 (ManySynth), and F.R.S.-FNRS mobility funding for scientific missions (Y. Oualhadj in UMONS, 2018). Florent Delgrange is now affiliated with Vrije Universiteit Brussel, Belgium. Mickael Randour is an F.R.S.-FNRS Research Associate.

Publisher Copyright:
© T. Brihaye, F. Delgrange, M. Randour, and Y. Oualhadj.

Keywords

  • Computer Science
  • Logic in Computer Science
  • Artificial Intelligence
  • Formal Languages and Automata Theory
  • Game Theory
  • Mathematics - Probability

Fingerprint

Dive into the research topics of 'Life is Random, Time is Not: Markov Decision Processes with Window Objectives'. Together they form a unique fingerprint.

Cite this