TY - JOUR
T1 - The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens
AU - CAFA Challenge
AU - Zhou, Naihui
AU - Jiang, Yuxiang
AU - Bergquist, Timothy R.
AU - Lee, Alexandra J.
AU - Kacsoh, Balint Z.
AU - Crocker, Alex W.
AU - Lewis, Kimberley A.
AU - Georghiou, George
AU - Nguyen, Huy N.
AU - Hamid, Md Nafiz
AU - Davis, Larry
AU - Dogan, Tunca
AU - Atalay, Volkan
AU - Rifaioglu, Ahmet S.
AU - Dalkıran, Alperen
AU - Cetin Atalay, Rengul
AU - Zhang, Chengxin
AU - Hurto, Rebecca L.
AU - Freddolino, Peter L.
AU - Zhang, Yang
AU - Bhat, Prajwal
AU - Supek, Fran
AU - Fernández, José M.
AU - Gemovic, Branislava
AU - Perovic, Vladimir R.
AU - Davidović, Radoslav S.
AU - Sumonja, Neven
AU - Veljkovic, Nevena
AU - Asgari, Ehsaneddin
AU - Mofrad, Mohammad R.K.
AU - Profiti, Giuseppe
AU - Savojardo, Castrense
AU - Martelli, Pier Luigi
AU - Casadio, Rita
AU - Boecker, Florian
AU - Schoof, Heiko
AU - Kahanda, Indika
AU - Thurlby, Natalie
AU - McHardy, Alice C.
AU - Renaux, Alexandre
AU - Saidi, Rabie
AU - Gough, Julian
AU - Freitas, Alex A.
AU - Antczak, Magdalena
AU - Fabris, Fabio
AU - Wass, Mark N.
AU - Hou, Jie
AU - Cheng, Jianlin
AU - Zhang, Zihan
AU - Liu, Yi Wei
PY - 2019/11/19
Y1 - 2019/11/19
N2 - Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Results: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a novel and major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.
AB - Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Results: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a novel and major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.
KW - Biofilm
KW - Community challenge
KW - Critical assessment
KW - Long-term memory
KW - Protein function prediction
UR - http://www.scopus.com/inward/record.url?scp=85075272104&partnerID=8YFLogxK
U2 - 10.1186/s13059-019-1835-8
DO - 10.1186/s13059-019-1835-8
M3 - Article
C2 - 31744546
AN - SCOPUS:85075272104
VL - 20
JO - Genome Biology
JF - Genome Biology
SN - 1474-7596
IS - 1
M1 - 244
ER -