Abstract
We study an application based on the chemical production plant of Castillo and Roberts: a two-stage process with four times two parallel machines.
Each order must first be handled by a 'stage-one' machine and afterwards by a 'stage-two' machine. At each stage, a scheduler must choose between two parallel machines. Parallel machines can handle the same type of tasks but may differ in speed. The schedulers' objective is to fulfill a sequence of product orders in the least amount of time, i.e. to minimize the makespan.
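To make the objective concrete, here is a minimal sketch (in Python, our choice) of how the makespan of such a two-stage schedule can be computed for a simplified instance with two parallel machines at each stage; the job data, machine speeds, and dispatch-in-sequence rule are illustrative assumptions, not taken from the paper.

```python
def makespan(jobs, assignment, speeds):
    """Completion time of the last order in a simplified two-stage schedule.

    jobs: list of (stage-one work, stage-two work) per order.
    assignment: list of (stage-one machine, stage-two machine) indices per order.
    speeds: {"stage1": [...], "stage2": [...]} speeds of the parallel machines.
    Orders are dispatched in the given sequence (illustrative assumption).
    """
    free1 = [0.0, 0.0]  # earliest time each stage-one machine becomes free
    free2 = [0.0, 0.0]  # earliest time each stage-two machine becomes free
    finish = 0.0
    for (w1, w2), (m1, m2) in zip(jobs, assignment):
        end1 = free1[m1] + w1 / speeds["stage1"][m1]             # stage-one completion
        free1[m1] = end1
        end2 = max(free2[m2], end1) + w2 / speeds["stage2"][m2]  # stage-two completion
        free2[m2] = end2
        finish = max(finish, end2)
    return finish


# Example: three orders; machines at the same stage differ in speed.
jobs = [(4.0, 2.0), (3.0, 5.0), (6.0, 1.0)]
assignment = [(0, 1), (1, 0), (0, 0)]
speeds = {"stage1": [1.0, 0.8], "stage2": [1.2, 1.0]}
print(makespan(jobs, assignment, speeds))
```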
The action and state space grow very large when the above scheduling problem is controlled in a centralized way. A better alternative is a multi-agent approach, in which one large problem is divided into several smaller ones, each solved by a single agent using its local/private information. This, however, introduces a new difficulty: from the viewpoint of one agent, the outcome of its actions may also depend on the actions of the other agents.
In our solution, agents decide which machine processes the next job: after each 'stage-one' machine, an agent selects a machine from stage two.
The learning techniques we apply to this problem are Learning Automata and Q-Learning. In previous work we applied Q-Learning to the problems of selecting parallel machines and multi-stage scheduling separately.
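As a rough illustration of the first technique, the sketch below shows a linear reward-inaction learning automaton that an agent could use to pick one of two stage-two machines; the update rule, learning rate, and reward signal are illustrative assumptions rather than the exact scheme of the paper.

```python
import random


class LearningAutomaton:
    """Minimal linear reward-inaction (L_R-I) automaton over a small action set."""

    def __init__(self, n_actions, learning_rate=0.1):
        self.p = [1.0 / n_actions] * n_actions  # action probabilities
        self.lr = learning_rate

    def choose(self):
        """Sample an action (machine index) according to the current probabilities."""
        r, acc = random.random(), 0.0
        for action, prob in enumerate(self.p):
            acc += prob
            if r <= acc:
                return action
        return len(self.p) - 1

    def update(self, action, reward):
        """Shift probability mass toward the chosen action; reward is scaled to [0, 1]."""
        for a in range(len(self.p)):
            if a == action:
                self.p[a] += self.lr * reward * (1.0 - self.p[a])
            else:
                self.p[a] -= self.lr * reward * self.p[a]


# Example use: the agent attached to one stage-one machine picks a stage-two machine.
agent = LearningAutomaton(n_actions=2)
machine = agent.choose()
# ... dispatch the order, observe how quickly it completes ...
agent.update(machine, reward=0.8)  # higher reward for faster completion (assumed signal)
```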
We measure the efficiency of the (online) algorithms by the ratio of the time they need to process a predefined series of orders to the time needed by the optimal offline algorithm.
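For concreteness, this efficiency measure can be expressed as a one-line helper on top of the makespan sketch above; the function name is ours, not the paper's.

```python
def efficiency(online_makespan, optimal_offline_makespan):
    """Ratio >= 1.0; equal to 1.0 when the online scheduler matches the offline optimum."""
    return online_makespan / optimal_offline_makespan
```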
Original language | English |
---|---|
Title | Book of Abstracts of the 14th Belgian-French-German Conference on Optimization |
Editors | J. De Schutter, M. Diehl, F. Glineur, E. Jarlebring, Q. Louveaux, W. Michiels, I. Smets, J. Suykens, J. Swevers, J. Vandewalle, J. Van Impe, M. Verschuure |
Publisher | Instituut voor Mediastudies (KU Leuven) |
Pages | 162-162 |
Number of pages | 1 |
ISBN (print) | 978-90-73802-00-1 |
Status | Published - Sep 2009 |

Publication series

Name | Belgian-French-German Conference on Optimization |
---|---|