Application of Learning Automata for Stochastic Online Scheduling

Yailen Martinez Jimenez, Bert Van Vreckem, David Catteeuw, Ann Nowe

    Research output: Chapter in Book/Report/Conference proceeding; Chapter; peer-review

    1 Citation (Scopus)


    We study a stochastic online scheduling problem in which exact job lengths are unknown and jobs arrive over time. Existing heuristics perform very well on this problem, but they do not extend to multi-stage problems in which every job must be processed by a sequence of machines.
    We successfully apply Learning Automata (LA), a Reinforcement Learning technique, to such a multi-stage scheduling setting. We place a Learning Automaton at each decision point in the production chain. Each Learning Automaton maintains a probability distribution over the machines it can choose. What distinguishes this approach from simple randomization algorithms is the update rule used by the LA. Whenever a job finishes, the LA are notified and update their probability distributions: if the job finished faster than expected, the probability of selecting the same action is increased; otherwise it is decreased.
    Through this adaptation, the LA can learn the processing capacities of the machines or, more precisely, of the entire downstream production chain.
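    The update rule described above can be illustrated with a minimal sketch. The abstract does not state which update scheme the authors use, so the code below assumes a standard linear reward-penalty scheme. The class name, the method names, and the learning rates are hypothetical choices made for illustration only.

```python
import random


class LearningAutomaton:
    """One automaton at one decision point, choosing among n machines.

    Assumption: a linear reward-penalty update; the source abstract does
    not specify the exact scheme or the learning rates.
    """

    def __init__(self, n_actions, reward_rate=0.1, penalty_rate=0.1, rng=None):
        if n_actions < 2:
            raise ValueError("an automaton needs at least two actions")
        # Start from a uniform distribution over the machines.
        self.probs = [1.0 / n_actions] * n_actions
        self.reward_rate = reward_rate
        self.penalty_rate = penalty_rate
        self.rng = rng or random.Random()

    def choose(self):
        """Sample a machine index according to the current probabilities."""
        return self.rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, action, faster_than_expected):
        """Shift probability toward `action` on success, away from it otherwise."""
        n = len(self.probs)
        a, b = self.reward_rate, self.penalty_rate
        for j in range(n):
            if faster_than_expected:
                if j == action:
                    self.probs[j] += a * (1.0 - self.probs[j])
                else:
                    self.probs[j] -= a * self.probs[j]
            else:
                if j == action:
                    self.probs[j] -= b * self.probs[j]
                else:
                    # Redistribute the removed mass evenly over the other actions.
                    self.probs[j] = (1.0 - b) * self.probs[j] + b / (n - 1)


if __name__ == "__main__":
    la = LearningAutomaton(n_actions=3, rng=random.Random(0))
    machine = la.choose()
    # Hypothetical feedback: the job routed to `machine` finished early.
    la.update(machine, faster_than_expected=True)
    print(machine, la.probs)
```

    Both branches of the update keep the probabilities summing to one, so the automaton always holds a valid distribution. How "faster than expected" is measured (for example, against a running average of observed completion times) is left open here, since the abstract does not define it.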
    Original language: English
    Title of host publication: The 14th Belgian-French-German Conference on Optimization
    Editors: M. Diehl, F. Glineur, E. Jarlebring, W. Michiels
    Number of pages: 8
    ISBN (Print): 978-3-642-12597-3
    Publication status: Published - Sep 2010
    Event: The 14th Belgian-French-German Conference on Optimization
    Duration: 7 Sep 2010 → …

    Publication series

    Name: Recent Advances in Optimization and its Applications in Engineering


    Conference: The 14th Belgian-French-German Conference on Optimization
    Period: 7/09/10 → …

    Bibliographical note

    M. Diehl; F. Glineur; E. Jarlebring; W. Michiels


    • Reinforcement Learning
    • Scheduling

