Abstract
This paper explores the potential of multi-objective reinforcement
learning (MORL) in infrastructural maintenance planning,
an area traditionally dominated by single-objective RL approaches.
We introduce the Multi-Objective Deep Centralized
Multi-Agent Actor Critic (MODCMAC), a MORL method for maintenance
planning capable of optimizing a policy with a known nonlinear
utility function under the Expected Scalarized Return (ESR)
criterion, while the state is only partially observable. Previous single-objective
RL methods had to combine multiple objectives, such as
risk and cost, into a single reward signal through reward shaping.
In contrast, MODCMAC can optimize a policy for multiple objectives
directly, even when the utility function is nonlinear. We
evaluated MODCMAC using a utility function based on the Failure
Mode, Effects, and Criticality Analysis (FMECA) methodology,
which takes the failure probability and cost as input. The evaluation
was carried out in an environment that requires optimizing a maintenance
plan for a historical quay wall. The performance of MODCMAC
was compared against a Belief-State-Based (BSB) policy with deterministic
or stochastic action selection. Our findings indicate that
MODCMAC outperforms the BSB policy. The code can be found at
https://github.com/jesserem/MODCMAC.
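
To illustrate why the ESR criterion matters when the utility function is nonlinear, the following minimal sketch compares the expected utility of per-episode returns (ESR) with the utility of the expected return (the alternative often called Scalarized Expected Return, SER). The utility function, the penalty weight, and the sample returns are hypothetical and chosen only for illustration; they are not the FMECA-based utility or the quay wall environment from the paper.

```python
from statistics import mean

def utility(cost, risk):
    # Hypothetical nonlinear utility (an assumption, not the paper's
    # FMECA-based function): risk is penalized quadratically.
    return -(cost + 10.0 * risk ** 2)

# Hypothetical vector-valued returns (cost, failure probability) of four
# episodes under one fixed policy: three safe episodes and one risky one.
episode_returns = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0), (1.0, 0.8)]

# ESR: apply the utility to each episode's return, then take the expectation.
esr = mean(utility(cost, risk) for cost, risk in episode_returns)

# SER: take the expectation of the return vector, then apply the utility.
mean_cost = mean(cost for cost, _ in episode_returns)
mean_risk = mean(risk for _, risk in episode_returns)
ser = utility(mean_cost, mean_risk)

print(f"ESR = {esr:.2f}, SER = {ser:.2f}")
```

With a linear utility the two values would coincide. Here they differ because averaging the returns first hides the single high-risk episode, which is the kind of outcome a per-episode utility is meant to capture.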
Original language | English |
---|---|
Number of pages | 8 |
Status | Published - 1 Oct 2023 |
Event | Multi-Objective Decision Making Workshop 2023 - Krakow, Poland. Duration: 1 Oct 2023 → 1 Oct 2023. https://modem2023.vub.ac.be |
Conference
Conference | Multi-Objective Decision Making Workshop 2023 |
---|---|
Short title | MODeM 2023 |
Country/Region | Poland |
City | Krakow |
Period | 1/10/23 → 1/10/23 |
Internet address | https://modem2023.vub.ac.be |
Projects
- 1 Active
- VLAAI1: Vlaams Artificiële Intelligentie Onderzoeksprogramma (VAIOP) – tweede cyclus
  1/01/24 → 31/12/28
  Project: Applied