AntiCP3: Prediction of Anticancer Proteins Using Evolutionary Information from Protein Language Models

This paper is a preprint and has not been certified by peer review.

Authors

Gupta, A.; Chauhan, M.; Tomer, R.; Raghava, G. P. S.

Abstract

Several computational methods have been developed for predicting anticancer peptides, including AntiCP and AntiCP2 from our group. Although these tools are widely used by the scientific community, they are not suitable for predicting anticancer proteins. In this study, we present AntiCP3, the first dedicated method for the prediction of anticancer proteins. All models were trained using five-fold cross-validation and evaluated on an independent dataset not used during training. Our initial analysis revealed distinct compositional differences between anticancer peptides and proteins, justifying the need for a separate prediction framework. We first implemented similarity-based approaches, which yielded moderate performance. We then developed machine learning and deep learning models using conventional protein features, achieving a maximum AUC of 0.72. Incorporating evolutionary information through PSSM profiles improved the AUC to 0.79. Embeddings from a fine-tuned protein language model, ESM-t33, improved performance further, to a best AUC of 0.90. Finally, a hybrid approach combining BLAST with our machine learning model achieved an AUC of 0.91. To support the scientific community, we have implemented AntiCP3 as both a web server and standalone software for the prediction of anticancer proteins (https://webs.iiitd.edu.in/raghava/anticp3/). We have also deployed our model on Hugging Face (https://huggingface.co/raghavagps-group/anticp3).
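
The abstract does not say how the BLAST result and the machine learning score are combined in the hybrid approach. The sketch below shows one plausible scheme, given purely as an illustration: the model probability is shifted up when the top BLAST hit is a known anticancer protein, shifted down when it is a known non-anticancer protein, and left unchanged when there is no significant hit. The function names, the `boost` value of 0.5, and the decision threshold are all assumptions and are not taken from AntiCP3.

```python
from typing import Optional


def hybrid_score(ml_prob: float, blast_hit_label: Optional[int],
                 boost: float = 0.5) -> float:
    """Combine a model probability with a BLAST top-hit label (hypothetical scheme).

    ml_prob         -- probability from the machine learning model, in [0, 1]
    blast_hit_label -- 1 if the top hit is a known anticancer protein,
                       0 if it is a known non-anticancer protein,
                       None if no hit passed the (assumed) e-value cutoff
    boost           -- assumed weight given to the similarity evidence
    """
    if not 0.0 <= ml_prob <= 1.0:
        raise ValueError("ml_prob must lie in [0, 1]")
    if blast_hit_label is None:
        return ml_prob              # no similarity evidence: keep the model score
    if blast_hit_label == 1:
        return ml_prob + boost      # similarity supports the positive class
    if blast_hit_label == 0:
        return ml_prob - boost      # similarity supports the negative class
    raise ValueError("blast_hit_label must be 1, 0, or None")


def classify(score: float, threshold: float = 0.5) -> bool:
    """Return True when the hybrid score meets the assumed decision threshold."""
    return score >= threshold


if __name__ == "__main__":
    # A borderline model score is pushed over the threshold by a positive hit.
    print(classify(hybrid_score(0.4, 1)))     # True
    print(classify(hybrid_score(0.4, None)))  # False
```

With this kind of scheme, similarity evidence dominates whenever a confident BLAST hit exists, and the model alone decides for sequences with no close homolog in the training data.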
