The fundamental subject of molecular biology is information, and more specifically the way in which
information is encoded, transformed and perpetuated in living cells and organisms. DNA structure carries
evolutionarily selected information on how to build proteins, whereas protein conformational and assembly states
hold the environmental information on which cells have to act in real time (1). Despite this central role,
the processing of information by biological systems is still not studied in a direct and quantitative manner.
To address this issue, the promotor of this project developed an in silico framework in the past year (2)
that uses Shannon's information theory (3), which was originally developed to study data compression and
reliable communication. This prototype uses the association between information theory and statistical
mechanics (4) to identify and quantify the biological information inherent in the collection of energetically
favourable residue conformations in protein structures. As such, we combine molecular biology and computer science
research to improve our understanding of the information-processing capacity of proteins and the complexes they
belong to. The Switch signalling team has the multidisciplinary expertise needed to address questions
concerning information processing in cells. This cross-fertilization is actively encouraged, with
benefits for both biological and computational research.
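As a minimal sketch of the link between statistical mechanics and information theory referred to above, the
example below converts the energies of a residue's conformations into Boltzmann probabilities and computes
their Shannon entropy in bits. The function names, the value of kT and the energy values are illustrative
assumptions and are not taken from the framework described in (2).

```python
import math

# Assumed value of kT near room temperature, in kcal/mol (illustrative only).
KT_KCAL_PER_MOL = 0.593


def boltzmann_probabilities(energies, kt=KT_KCAL_PER_MOL):
    """Convert conformation energies (kcal/mol) into Boltzmann probabilities."""
    e_min = min(energies)  # shift energies so the largest weight is exp(0) = 1
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    partition = sum(weights)
    return [w / partition for w in weights]


def shannon_entropy_bits(probabilities):
    """Shannon entropy H = -sum(p * log2(p)), skipping zero-probability states."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0.0)


# Hypothetical energies for four side-chain conformations of one residue.
unbound_energies = [0.0, 0.0, 0.0, 0.0]  # all conformations equally favourable
bound_energies = [0.0, 3.0, 3.0, 3.0]    # one conformation strongly favoured

h_unbound = shannon_entropy_bits(boltzmann_probabilities(unbound_energies))
h_bound = shannon_entropy_bits(boltzmann_probabilities(bound_energies))

# The drop in entropy is one simple way to express the information gained
# about the residue's state when its conformational freedom is restricted.
information_gain_bits = h_unbound - h_bound
```

In this toy case the four equally favourable conformations correspond to 2 bits of uncertainty, and
restricting the residue to a single dominant conformation lowers that value.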
By combining this in-house expertise, we developed a framework that serves as a proof of concept
for the hypothesis that protein domains are information-processing (or computational) units.
The system assumes that the amino acids in protein structures are thermodynamically coupled into a network
in which changes in the conformational states of the amino acid side chains play a role in the propagation of
information through the protein structure. An analysis of the SH2 domain has shown that binding of a
phosphorylated peptide to the domain results in the transmission of information from the binding cavities to
specific residues located on the opposite side of the protein structure (2). In other words, the framework
identifies and quantifies long-distance communication in the SH2 domain. As such, our system confirms that
allostery (defined as a coupling between binding sites over long distances) is not restricted to large
structural changes alone (5, 6): even simple domains like SH2, SH3 and PDZ domains explicitly process
information, transmitting it to either neighbouring domains or other proteins bound to their surface.
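To illustrate what coupling between the conformational states of two residues can mean in
information-theoretic terms, the sketch below computes the mutual information between two residues from a
joint probability table of their states. This pairwise measure, the function name and the probability
tables are assumptions chosen for illustration; the measure actually used by the framework in (2) may differ.

```python
import math


def mutual_information_bits(joint):
    """Mutual information (bits) between two residues.

    joint[i][j] is the probability that residue A is in conformational
    state i while residue B is in conformational state j.
    """
    total = sum(sum(row) for row in joint)
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("joint probabilities must sum to 1")

    p_a = [sum(row) for row in joint]        # marginal distribution of residue A
    p_b = [sum(col) for col in zip(*joint)]  # marginal distribution of residue B

    mi = 0.0
    for i, row in enumerate(joint):
        for j, p_ij in enumerate(row):
            if p_ij > 0.0:
                mi += p_ij * math.log2(p_ij / (p_a[i] * p_b[j]))
    return mi


# Hypothetical two-state residues.
uncoupled = [[0.25, 0.25],
             [0.25, 0.25]]  # state of A says nothing about state of B
coupled = [[0.5, 0.0],
           [0.0, 0.5]]      # state of A fully determines state of B

mi_uncoupled = mutual_information_bits(uncoupled)
mi_coupled = mutual_information_bits(coupled)
```

Under these assumptions, independent residues share no information, whereas fully coupled two-state residues
share one bit. A chain of such couplings is one way to picture an information channel through a domain.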
These results are validated by analysing point mutations of residues that
belong to the information channel inside the domain structure. This experimental validation is carried out both
in house and with other national and international groups that specialize in particular domains such as SH2,
SH3 and PDZ domains. In order to perform this validation properly and to make the current prototype more
robust, some additional in silico steps are required (see project description). Once these steps are
completed, the resulting framework will provide a complete computational environment to study the
information-processing capacity of proteins, and this environment will in turn enable accurate experimental
studies. The signalling team in Switch has set up the necessary environment and international contacts to
successfully complete the interdisciplinary project proposed here.