Enhancing generalization and fairness in machine learning: Integrative learning approaches for nanophotonics and educational data mining

Research output: PhD Thesis

Abstract

Artificial Intelligence (AI) has realized significant breakthroughs in fields ranging from healthcare and
education to agriculture and environmental science, offering solutions that once seemed beyond
reach. However, these advanced machine learning models face significant challenges. One key issue
is their struggle to generalize effectively beyond the specific data they were trained on, which can
lead to inaccurate outcomes. An example of this is a language model generating plausible but false
information. Additionally, these models can inadvertently perpetuate or even exacerbate biases in
their training data, leading to unfair outcomes in critical societal applications, such as skewed hiring
practices. Given these challenges, this research focuses on developing new machine learning
methodologies that enhance both the generalization capabilities and the fairness of AI systems.
The core contributions of this thesis include the development of novel models that optimize
accuracy while ensuring fairness across diverse datasets. Specifically, we explore applications in two
contrasting but increasingly data-driven fields: nanophotonics and educational data mining. In
nanophotonics, the research leverages simulation data, physics information, and a novel clustering
algorithm to refine predictive models so that they more accurately account for the electromagnetic
behavior of nanophotonic devices. This approach is particularly effective in improving accuracy for outliers.
Methodologically, this work introduces innovative strategies that integrate different learning
algorithms into models to handle the variability and complexities of the data in educational
settings. The thesis also sheds light on the limitations of current debiasing methods and introduces a
new bias mitigation approach that enhances fairness under varying conditions of awareness of
protected attributes. Through a detailed analysis of these applications, this work advances our
understanding of how machine learning can be adapted to better handle diversity in datasets and
showcases practical implementations that could be adopted in other data-driven fields.
Original language: English
Awarding institution
  • Vrije Universiteit Brussel
Supervisor(s)/advisor
  • Vandersteen, Gerd, Supervisor
  • Ferranti, Francesco, Co-supervisor
Award date: 22 Oct 2024
Status: Published - 2024
