Advancing the prediction and understanding of coding variant effects.

Project Details

Description

The available data on human genome variation is growing exponentially. Although computational methods that identify which of these variants cause disease are being developed at great speed, the resulting tools give varying results and tend to work well only in cases where single variants are strongly correlated with disease. It therefore remains essential to increase both the reliability of these methods and our understanding of them.

In this project I focus on identifying disease-causing variants within the pool of genome variation, in particular non-synonymous coding variants that change the amino acid sequence of expressed proteins. Methods working at this level use amino acid residue-specific features to identify deleterious mutants; I want to increase our understanding of which features work in which context. I also relate these mutants to protein structure, dynamics and stability information in order to understand why they have their particular effect.

The project has the advantage that it introduces novel protein-related features that have never been incorporated before, for instance sequence-based contact prediction, the work of my master's project, and novel methods to predict protein dynamics and stability, for example from my promotor Wim Vranken (SBB, VUB). In particular, the decreased reliance on the availability of experimental structures or homology models makes these novel features more generally applicable.

I will also attempt to understand why the features are predictive: which features work in conjunction, are independent, or counteract each other, and in which protein context? This work will exploit different relevance measures, based for instance on information-theoretic and cooperative game theory approaches, together with co-promotor Tom Lenaerts (AI group, VUB). The understanding gained from these analyses should help to further improve the quality of the predictors, as it will allow these novel insights to be incorporated into the predictive models.

The newly designed predictors will immediately be validated in the complex medical setting of Brugada syndrome, for which exome data is available, in collaboration with co-promotor Sonia van Dooren at the UZ Brussel. As it has been observed that a single variant does not explain the disease, the long-term aim here is also to examine how the predictors can be adjusted to assess variant combinations.

Through my work I therefore want to further the reliable identification of deleterious mutants, provide insights for medical applications, and in the process also better understand why some features are predictive, depending on their protein context.

Acronym: OZR2539
Status: Finished
Effective start/end date: 1/10/13 to 30/09/15

Flemish discipline codes

  • Biological sciences

Keywords

  • Applied biology